Abstract

We study a model of communication complexity that encompasses many well-studied problems, including clas- 
sical and quantum communication complexity, the complexity of simulating distributions arising from bipartite mea- 
surements of shared quantum states, and XOR games. In this model, Alice gets an input x, Bob gets an input y, and 
their goal is to each produce an output a, b distributed according to some pre-specified joint distribution $p(a,b|x,y)$.
Our results apply to any non-signaling distribution, that is, those where Alice's marginal distribution does not depend 
on Bob's input, and vice versa. 

By introducing a simple new technique based on affine combinations of lower-complexity distributions, we give 
the first general technique to apply to all these settings, with elementary proofs and very intuitive interpretations. The 
lower bounds we obtain can be expressed as linear programs (or SDPs for quantum communication). We show that the 
dual formulations have a striking interpretation, since they coincide with maximum violations of Bell and Tsirelson 
inequalities. The dual expressions are closely related to the winning probability of XOR games. Despite their apparent 
simplicity, these lower bounds subsume many known communication complexity lower bound methods, most notably 
the recent lower bounds of Linial and Shraibman for the special case of Boolean functions. 

We show that as in the case of Boolean functions, the gap between the quantum and classical lower bounds is at 
most linear in the size of the support of the distribution, and does not depend on the size of the inputs. This translates 
into a bound on the gap between maximal Bell and Tsirelson inequality violations, which was previously known only 
for the case of distributions with Boolean outcomes and uniform marginals. It also allows us to show that for some 
distributions, information theoretic methods are necessary to prove strong lower bounds. 

Finally, we give an exponential upper bound on quantum and classical communication complexity in the simul- 
taneous messages model, for any non-signaling distribution. One consequence of this is a simple proof that any 
quantum distribution can be approximated with a constant number of bits of communication. 

1 Introduction 

Communication complexity of Boolean functions has a long and rich past, stemming from the paper of Yao in 
1979 [Yao79], whose motivation was to study the area of VLSI circuits. In the years that followed, tremendous
progress has been made in developing a rich array of lower bound techniques for various models of communication
complexity (see e.g. [KN97]).

From the physics side, the question of studying how much communication is needed to simulate distributions 
arising from physical phenomena, such as measuring bipartite quantum states, was posed in 1992 by Maudlin, a 
philosopher of science, who wanted to quantify the non-locality inherent to these systems [Mau92]. Maudlin, and the
authors who followed [BCT99, Ste00, TB03, CGMP05, DLR07] (some independently of his work, and of each other),
progressively improved upper bounds on simulating correlations of the 2-qubit singlet state. In a recent breakthrough,
Regev and Toner [RT07] proved that two bits of communication suffice to simulate the correlations arising from
two-outcome measurements of arbitrary-dimension bipartite quantum states. In the more general case of non-binary
outcomes, Shi and Zhu gave a protocol to approximate quantum distributions within constant error, using constant
communication [SZ08]. No non-trivial lower bounds are known for this problem.

In this paper, we consider the more general framework of simulating non-signaling distributions. These are distributions
of the form $p(a,b|x,y)$, where Alice gets input x and produces an output a, and Bob gets input y and outputs b.
The non-signaling condition is a fundamental property of bipartite physical systems, which states that the players gain
no information on the other player's input. In particular, distributions arising from quantum measurements on shared 
bipartite states are non-signaling, and Boolean functions may be reduced to extremal non-signaling distributions with 
Boolean outcomes and uniform marginals. 

Outside of the realm of Boolean functions, a very limited number of tools are available to analyse the commu- 
nication complexity of distributed tasks, especially for quantum distributions with non-uniform marginals. In such 
cases, the distributions live in a larger-dimensional space and cannot be cast as communication matrices, so standard 
techniques do not apply. The structure of non-signaling distributions has been the object of much study in the quantum 
information community, yet outside the case of distributions with Boolean inputs or outcomes [JM05, BP05], or with
uniform marginal distributions, much remains to be understood. 

Our main contribution is a new method for handling all non-signaling distributions, including the case of non- 
Boolean outcomes and non-uniform marginals, based on affine combinations of lower-complexity distributions, which 
we use to obtain both upper and lower bounds on communication. We use the elegant geometric structure of the 
non-signaling distributions to analyse the communication complexity of Boolean functions, but also non-Boolean or 
partial functions. Although they are formulated, and proven, in quite a different way, our lower bounds turn out to 
subsume Linial and Shraibman's factorization norm lower bounds [LS09], in the restricted case of Boolean functions.
Similarly, our upper bounds extend the upper bounds of Shi and Zhu for approximating quantum distributions [SZ08]
to all non-signaling distributions (in particular distributions obtained by protocols using entanglement and quantum 
communication). 

Our complexity measures can be expressed as linear (or semidefinite) programs, and when we consider the dual of 
our lower bound expressions, these turn out to correspond precisely to maximal Bell inequality violations in the case 
of classical communication, and Tsirelson inequality violations for quantum communication. Hence, we have made 
formal the intuition that large Bell inequalities should lead to large lower bounds on communication complexity. 

We also show that there cannot be a large gap between the classical and quantum expressions. This was previously 
known only in the case of distributions with Boolean outcomes and uniform marginals, where it followed from Tsirelson's
theorem and Grothendieck's inequality, neither of which is known to extend beyond this special case. This also
shows that our method, as was already the case for Linial and Shraibman's bounds, cannot hope to prove large gaps 
between classical and quantum communication complexity. While this is a negative result, it also sheds some light on 
the relationship between the Linial and Shraibman family of lower bound techniques, and the information theoretic 
methods, such as the recent subdistribution bound [JKN08], one of the few lower bound techniques not known to
follow from Linial and Shraibman. We give an example of a problem [BCT99] for which rectangle size gives an
exponentially better lower bound than our method. 

Summary of results. The paper is organized as follows. In Section 2, we give the required definitions and models of
communication complexity and characterizations of the classes of distributions we consider. 

In Section 3, we prove our lower bound on classical and quantum communication, and show that it coincides with
Linial and Shraibman's method in the special case of Boolean functions (Theorem 13).

Our lower bounds are linear programs (respectively, SDPs in the quantum case), and in Section 4, we show that
the dual linear programs (resp. SDPs) have a natural interpretation in quantum information, as they coincide with 
Bell (resp. Tsirelson) inequality violations (Theorem 17). We also give a dual expression which has a natural
interpretation as the maximum winning probability of an associated XOR game (Corollary 19). The primal form
is also the multiplicative inverse of the maximum winning probability of the associated XOR game, where all inputs 
have the same winning probability. 

In Section 5, we compare the two methods and show that the quantum and classical lower bound expressions can
differ by at most a factor that is linear in the number of outcomes (Theorem 22).

Finally, in Section 6, we give upper bounds on simultaneous messages complexity in terms of our lower bound
expression (Theorem 26). We use fingerprinting methods [BCWd01, Yao03, SZ08, GKd06] to give very simple proofs
that classical communication with shared randomness, or quantum communication with shared entanglement, can be 
simulated in the simultaneous messages model, with exponential blowup in communication, and in particular that any 
quantum distribution can be approximated with constant communication. 

Related work. The use of affine combinations for non-signaling distributions has roots in the quantum logic
community, where quantum non-locality has been studied within the setting of more general probability theories
[FR81, RF81, KRF87, Wil92]. Until recently, this line of work was largely unknown in the quantum information theory
community [Bar07, BBLW07].

The structure of the non-signaling polytope has been the object of much study. A complete characterization of 
the vertices has been obtained in some, but not all cases: for two players, the case of binary inputs [BLM+05] and
the case of binary outputs [BP05, JM05] are known, and for n players, the case of Boolean inputs and outputs is
known [BP05].

The work on simulating quantum distributions has focused mainly on providing upper bounds, and most results ap- 
ply to simulating the correlations only. A few results address the simulation of quantum distributions with non-uniform 
marginals. Bacon and Toner give an upper bound of 2 bits for non-maximally entangled qubit pairs [TB03]. Shi and
Zhu [SZ08] show a constant upper bound for approximating any quantum distribution (including the marginals) to
within a constant. 

Pironio gives a general lower bound technique based on Bell-like inequalities [Pir03]. There are a few ad hoc lower
bounds on simulating quantum distributions, including a linear lower bound for a distribution based on Deutsch-Jozsa's
problem [BCT99], and a recent lower bound of Gavinsky [Gav09].

The $\gamma_2$ method was first introduced as a measure of the complexity of matrices [LMSS07]. It was shown to be
a lower bound on communication complexity [LS09], and to generalize many previously known methods. Lee et al.
use it to establish direct product theorems and relate the dual norm of $\gamma_2$ to the value of XOR games [LSŠ08]. Lee
and Shraibman [LS08] use a multidimensional generalization of a related quantity $\mu$ (where the norm-1 ball consists
of cylinder intersections) to prove a lower bound in the multiparty number-on-the-forehead model, for the disjointness
function.

2 Preliminaries 

In this paper, we extend the framework of communication complexity to non-signaling distributions. This framework 
encompasses the standard models of communication complexity of Boolean functions but also total and partial non- 
Boolean functions and relations, as well as distributions arising from the measurements of bipartite quantum states. 
Most results we present also extend to the multipartite setting. 

2.1 Non-signaling distributions 

Non-signaling, a fundamental postulate of physics, states that any observation on part of a system cannot instan- 
taneously affect a remote part of the system, or similarly, that no signal can travel instantaneously. We consider 
distributions $p(a,b|x,y)$ where $x \in \mathcal{X}$, $y \in \mathcal{Y}$ are the inputs of the players, and they are required to each produce
an outcome $a \in \mathcal{A}$, $b \in \mathcal{B}$, distributed according to $p(a,b|x,y)$. We restrict ourselves to the distributions where each
player's outcome does not depend on the other player's input. Mathematically, non-signaling (also called causality) is 
defined as follows. 

Definition 1 (Non-signaling distributions). A bipartite, conditional distribution p is non-signaling if 

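\[ \forall a, x, y, y': \quad \sum_b p(a,b|x,y) = \sum_b p(a,b|x,y'), \]
\[ \forall b, x, x', y: \quad \sum_a p(a,b|x,y) = \sum_a p(a,b|x',y). \]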

For any non-signaling distribution, the marginal distribution on Alice's output $p(a|x,y) = \sum_b p(a,b|x,y)$ does
not depend on $y$, so we write $p(a|x)$, and similarly $p(b|y)$ for the marginal distribution on Bob's output. We denote by
$\mathcal{C}$ the set of all non-signaling distributions.
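As a small illustration of Definition 1, the following sketch checks the two marginal conditions numerically for distributions stored as dictionaries. The function name `is_non_signaling`, the dictionary layout, and the two example boxes are assumptions made for this example only; they do not come from the paper.

```python
from itertools import product

def is_non_signaling(p, X, Y, A, B, tol=1e-9):
    """p maps (a, b, x, y) to the probability p(a,b|x,y)."""
    # Alice's marginal sum_b p(a,b|x,y) must not depend on y.
    for a, x in product(A, X):
        marg = [sum(p[(a, b, x, y)] for b in B) for y in Y]
        if max(marg) - min(marg) > tol:
            return False
    # Bob's marginal sum_a p(a,b|x,y) must not depend on x.
    for b, y in product(B, Y):
        marg = [sum(p[(a, b, x, y)] for a in A) for x in X]
        if max(marg) - min(marg) > tol:
            return False
    return True

X = Y = (0, 1)
A = B = (+1, -1)

# Hypothetical example 1: outputs satisfy a*b = (-1)^(x*y), with uniform marginals.
xor_box = {(a, b, x, y): 0.5 if a * b == (-1) ** (x * y) else 0.0
           for a, b, x, y in product(A, B, X, Y)}

# Hypothetical example 2: Alice's output reveals Bob's input y, so it is signaling.
leaky_box = {(a, b, x, y): 1.0 if (a == (1 if y == 0 else -1) and b == 1) else 0.0
             for a, b, x, y in product(A, B, X, Y)}

print(is_non_signaling(xor_box, X, Y, A, B))    # True
print(is_non_signaling(leaky_box, X, Y, A, B))  # False
```

The first box has uniform marginals for every input pair, so both conditions hold; in the second, Alice's marginal changes with $y$, which violates the first condition.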

In the case of binary outcomes, more specifically, A = B = {±1}, it is known that a non-signaling distribution is 
uniquely determined by the (expected) correlations, defined as $C(x,y) = E(a \cdot b|x,y)$, and the (expected) marginals,
defined as $M_A(x) = E(a|x)$, $M_B(y) = E(b|y)$.

Proposition 1. For any functions $C : \mathcal{X} \times \mathcal{Y} \to [-1,1]$, $M_A : \mathcal{X} \to [-1,1]$, $M_B : \mathcal{Y} \to [-1,1]$, satisfying
$1 + a \cdot b\, C(x,y) + a M_A(x) + b M_B(y) \geq 0$ for all $(x,y) \in \mathcal{X} \times \mathcal{Y}$ and $a, b \in \{\pm 1\}$, there is a unique non-signaling
distribution $p$ such that $\forall x, y$, $E(a \cdot b|x,y) = C(x,y)$ and $E(a|x) = M_A(x)$ and $E(b|y) = M_B(y)$, where $a, b$ are
distributed according to $p$.

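Proof. Fix $x, y$. $C$, $M_A$, $M_B$ are obtained from $p$ by the following full rank system of equations.
\[
\begin{pmatrix} C(x,y) \\ M_A(x) \\ M_B(y) \\ 1 \end{pmatrix}
=
\begin{pmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} p(+1,+1|x,y) \\ p(+1,-1|x,y) \\ p(-1,+1|x,y) \\ p(-1,-1|x,y) \end{pmatrix}
\]
Computing the inverse yields $p(a,b|x,y) = \frac{1}{4}\left(1 + a \cdot b\, C(x,y) + a M_A(x) + b M_B(y)\right)$. □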
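The correspondence in Proposition 1 can be checked on a toy instance. The sketch below builds the four outcome probabilities for one fixed input pair from a correlation value and two marginal values, then recovers those values as expectations. The helper names and the numbers 0.5, 0.25, -0.25 are hypothetical, chosen only so that all four probabilities are nonnegative.

```python
from itertools import product

SIGNS = (+1, -1)

def dist_from_correlations(c, ma, mb):
    """Return {(a, b): p(a,b)} for one fixed input pair, using
    p(a,b) = (1 + a*b*c + a*ma + b*mb) / 4 from Proposition 1."""
    return {(a, b): (1 + a * b * c + a * ma + b * mb) / 4
            for a, b in product(SIGNS, SIGNS)}

def expectations(p):
    """Recover (correlation, Alice's marginal, Bob's marginal) from p."""
    corr = sum(a * b * pr for (a, b), pr in p.items())
    m_a = sum(a * pr for (a, b), pr in p.items())
    m_b = sum(b * pr for (a, b), pr in p.items())
    return corr, m_a, m_b

# Hypothetical values for a single input pair (x, y).
p = dist_from_correlations(0.5, 0.25, -0.25)

print(p)                # four nonnegative probabilities summing to 1
print(expectations(p))  # recovers the correlation and the two marginals
```

For these values the probabilities are 3/8, 1/4, 0 and 3/8, and taking expectations returns the triple that was used to build them, which is the bijection the proposition asserts.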

We will write $p = (C, M_A, M_B)$ and use both notations interchangeably when considering distributions over
binary outcomes. We also denote by $\mathcal{C}_0$ the set of non-signaling distributions with uniform marginals, that is, $p =
(C, 0, 0)$, and write $C \in \mathcal{C}_0$, omitting the marginals when there is no ambiguity.

2.1.1 Boolean functions 

The communication complexity of Boolean functions is a special case of the problem of simulating non-signaling 
distributions. As we shall see in Section 2.3, the associated distributions are extremal points of the
non-signaling polytope. If the distribution stipulates that the product of the players' outputs equal some function
$f : \mathcal{X} \times \mathcal{Y} \to \{\pm 1\}$, then this corresponds to the standard model of communication complexity (up to an additional
bit of communication, for Bob to output $f(x,y)$). If we further require that Alice's output be +1 or -1 with equal
probability, likewise for Bob, then the distribution is non-signaling and has the following form:

Definition 2. For a function $f : \mathcal{X} \times \mathcal{Y} \to \{\pm 1\}$, denote by $p_f$ the distribution defined by $p_f(a,b|x,y) = \frac{1}{2}$ if
$f(x,y) = a \cdot b$ and $0$ otherwise. Equivalently, $p_f = (C_f, 0, 0)$ where $C_f(x,y) = f(x,y)$.

In the case of randomized communication complexity, a protocol that simulates a Boolean function with error probability
$\epsilon$ corresponds to simulating correlations $C'$ scaled down by a factor at most $1 - 2\epsilon$, that is, $\forall x, y$, $\mathrm{sgn}(C'(x,y)) =
C_f(x,y)$ and $|C'(x,y)| \geq 1 - 2\epsilon$. While we will not consider these cases in full detail, non-Boolean functions, partial
functions and some classes of relations may be handled in a similar fashion, hence our techniques can be used to show 
lower bounds in these settings as well. 

2.1.2 Quantum distributions 

Of particular interest in the study of quantum non-locality are the distributions arising from measuring bipartite quan- 
tum states. We will use the following definition: 

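Definition 3. A distribution $p$ is quantum if there exists a quantum state $|\psi\rangle$ in a Hilbert space $\mathcal{H}$ and measurement
operators $\{E_a(x) : a \in \mathcal{A}, x \in \mathcal{X}\}$ and $\{E_b(y) : b \in \mathcal{B}, y \in \mathcal{Y}\}$, such that $p(a,b|x,y) = \langle\psi|E_a(x)E_b(y)|\psi\rangle$, with
the measurement operators satisfying

1. $E_a(x)^\dagger = E_a(x)$ and $E_b(y)^\dagger = E_b(y)$,

2. $E_a(x) \cdot E_{a'}(x) = \delta_{aa'} E_a(x)$ and $E_b(y) \cdot E_{b'}(y) = \delta_{bb'} E_b(y)$,

3. $\sum_a E_a(x) = \mathbb{1}$ and $\sum_b E_b(y) = \mathbb{1}$, where $\mathbb{1}$ is the identity operator on $\mathcal{H}$,

4. $E_a(x) \cdot E_b(y) = E_b(y) \cdot E_a(x)$.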

Note that a more standard definition would be to replace the last condition on the measurement operators (commutativity)
by the stronger condition that the operators $E_a(x)$ act non-trivially on a subspace $\mathcal{H}_A$ only, and that the
operators $E_b(y)$ act non-trivially on a subspace $\mathcal{H}_B$ only, with $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$. If we restrict the Hilbert space $\mathcal{H}$ to be
finite-dimensional, these two definitions are equivalent, but whether this also holds in full generality is still unknown.
We use this less standard definition because this will allow us to use the results from [NPA08] (see this reference for a
discussion about the different definitions).

We denote by $\mathcal{Q}$ the set of all quantum distributions. In the restricted case of binary outcomes with uniform
marginals, we let $\mathcal{Q}_0$ be the set of all quantum correlations.

The communication complexity of simulating traceless binary measurements on maximally entangled states has 
been settled by Regev and Toner using two bits of communication, since in this case the marginals are uniform [RT07].
Their technique also handles general binary measurements on any entangled state, but in this case they only simulate 
the correlations. The complexity of simulating the full joint distribution exactly when the marginals are non-uniform 
remains open. 

2.2 Models of communication complexity 

We consider the following model of communication complexity of non-signaling distributions $p$. Alice gets input $x$,
Bob gets input $y$, and after exchanging bits or qubits, Alice has to output $a$ and Bob $b$ so that the joint distribution
is $p(a,b|x,y)$. $R_0(p)$ denotes the communication complexity of simulating $p$ exactly, using private randomness and
classical communication. $Q_0(p)$ denotes the communication complexity of simulating $p$ exactly, using quantum
communication. We use superscripts "pub" and "ent" in the case where the players share random bits or quantum
entanglement. For $R_\epsilon(p)$, we are only required to simulate some distribution $p'$ such that $\delta(p,p') \leq \epsilon$, where
$\delta(p,p') = \max\{|p(\mathcal{E}|x,y) - p'(\mathcal{E}|x,y)| : (x,y) \in \mathcal{X} \times \mathcal{Y}, \mathcal{E} \subseteq \mathcal{A} \times \mathcal{B}\}$ is the total variation distance (or statistical
distance) between two distributions.
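The distance $\delta$ can be evaluated directly on small examples. The sketch below computes it in two ways: by collecting, for each input pair, the outcomes where one distribution exceeds the other, and by brute-force enumeration of every event $\mathcal{E}$. The function names, the dictionary layout and the two example distributions are assumptions of this illustration.

```python
from itertools import combinations, product

def tv_distance(p, q, X, Y, A, B):
    """delta(p, q): max over inputs (x, y) and events E of |p(E|x,y) - q(E|x,y)|."""
    best = 0.0
    for x, y in product(X, Y):
        # The maximizing event collects exactly the outcomes where p exceeds q.
        gap = sum(max(p[(a, b, x, y)] - q[(a, b, x, y)], 0.0)
                  for a, b in product(A, B))
        best = max(best, gap)
    return best

def tv_distance_bruteforce(p, q, X, Y, A, B):
    """Same quantity, enumerating every event E (feasible only for tiny outcome sets)."""
    outcomes = list(product(A, B))
    best = 0.0
    for x, y in product(X, Y):
        for r in range(len(outcomes) + 1):
            for event in combinations(outcomes, r):
                diff = sum(p[(a, b, x, y)] - q[(a, b, x, y)] for a, b in event)
                best = max(best, abs(diff))
    return best

X = Y = (0, 1)
A = B = (+1, -1)

# Hypothetical pair of distributions: a perfectly correlated box and uniform noise.
box = {(a, b, x, y): 0.5 if a * b == (-1) ** (x * y) else 0.0
       for a, b, x, y in product(A, B, X, Y)}
uniform = {key: 0.25 for key in box}

print(tv_distance(box, uniform, X, Y, A, B))             # 0.5
print(tv_distance_bruteforce(box, uniform, X, Y, A, B))  # 0.5
```

The two functions agree because, for normalized distributions, the best event is always the set of outcomes on which the first distribution is larger.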

For distributions with binary outcomes, we write $R_\epsilon(C, M_A, M_B)$ and $Q_\epsilon(C, M_A, M_B)$. In the case of Boolean
functions, $R_\epsilon(C) = R_\epsilon(C, 0, 0)$ corresponds to the usual notion of computing $f$ with probability at least $1 - \epsilon$, where $C$
is the ±1 communication matrix of $f$. From the point of view of communication, distributions with uniform marginals
are the easiest to simulate. Suppose we have a protocol that simulates correlations $C$ with arbitrary marginals. By
using just an additional shared random bit, both players can flip their outcome whenever the shared random bit is 1.
Since each player's marginal outcome is now an even coin flip, this protocol simulates the distribution $(C, 0, 0)$.

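Proposition 2. For any Boolean non-signaling distribution $(C, M_A, M_B)$, we have $R^{\mathrm{pub}}_\epsilon(C, 0, 0) \leq R^{\mathrm{pub}}_\epsilon(C, M_A, M_B)$
and $Q^{\mathrm{ent}}_\epsilon(C, 0, 0) \leq Q^{\mathrm{ent}}_\epsilon(C, M_A, M_B)$.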
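The flipping argument behind Proposition 2 can be verified exactly for a single input pair. In the sketch below, a distribution with non-uniform marginals is averaged with its globally flipped version, which is what a shared random bit achieves; the correlation is unchanged and both marginals vanish. The helper names and the numerical values are hypothetical.

```python
from itertools import product

SIGNS = (+1, -1)

def from_correlations(c, ma, mb):
    """p(a,b) for one fixed input pair, from correlation c and marginals ma, mb."""
    return {(a, b): (1 + a * b * c + a * ma + b * mb) / 4
            for a, b in product(SIGNS, SIGNS)}

def flip_with_shared_bit(p):
    """With probability 1/2 (shared bit equal to 1) both players negate their outputs."""
    return {(a, b): 0.5 * p[(a, b)] + 0.5 * p[(-a, -b)] for a, b in p}

def correlation(p):
    return sum(a * b * pr for (a, b), pr in p.items())

def marginal_a(p):
    return sum(a * pr for (a, b), pr in p.items())

def marginal_b(p):
    return sum(b * pr for (a, b), pr in p.items())

# Hypothetical distribution with non-uniform marginals.
p = from_correlations(0.5, 0.25, -0.25)
s = flip_with_shared_bit(p)

# Correlation stays 0.5, both marginals become 0.
print(correlation(s), marginal_a(s), marginal_b(s))
```

Negating both outputs leaves the product $a \cdot b$ untouched, which is why only the marginals are affected.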

2.3 Characterization of the sets of local and non-signaling distributions 

In the quantum information literature, the distributions that can be simulated with shared randomness and no commu- 
nication (also called a local hidden variable model) are called local distributions. 

Definition 4. Local deterministic distributions are of the form $p(a,b|x,y) = \delta_{a=\lambda_A(x)} \cdot \delta_{b=\lambda_B(y)}$ where $\lambda_A : \mathcal{X} \to \mathcal{A}$
and $\lambda_B : \mathcal{Y} \to \mathcal{B}$, and $\delta$ is the Kronecker delta. A distribution is local if it can be written as a convex combination of
local deterministic distributions.

We denote by $\Lambda$ the set of local deterministic distributions $\{p^\lambda\}_{\lambda \in \Lambda}$ and by $\mathcal{L}$ the set of local distributions. Let
$\mathrm{conv}(A)$ denote the convex hull of $A$. In the case of binary outcomes, we have

Proposition 3. $\mathcal{L} = \mathrm{conv}(\{(u^T v, u, v) : u \in \{\pm 1\}^{\mathcal{X}}, v \in \{\pm 1\}^{\mathcal{Y}}\})$.

We also denote by $\mathcal{L}_0$ the set of local correlations over binary outcomes with uniform marginals.

The quantum information literature reveals a great deal of insight into the structure of the classical, quantum, and 
non-signaling distributions. It is well known that $\mathcal{L}$ and $\mathcal{C}$ are polytopes. While the extremal points of $\mathcal{L}$ are simply
the local deterministic distributions, the non-signaling polytope $\mathcal{C}$ has a more complex structure [JM05, BP05]. $\mathcal{C}_0$ is
the convex hull of the distributions obtained from Boolean functions.

Proposition 4. $\mathcal{C}_0 = \mathrm{conv}(\{(C_f, 0, 0) : C_f \in \{\pm 1\}^{\mathcal{X} \times \mathcal{Y}}\})$.

We show that $\mathcal{C}$ is the affine hull of the local polytope (restricted to the positive orthant since all probabilities
$p(a,b|x,y)$ must be nonnegative). We give a simple proof for the case of binary outcomes but this carries over to the
general case. This was shown independently of us, on a few occasions in different communities [RF81, FR81, KRF87,
Wil92, Bar07].

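Theorem 5. $\mathcal{C} = \mathrm{aff}^+(\mathcal{L})$, where $\mathrm{aff}^+(\mathcal{L})$ is the restriction to the positive orthant of the affine hull of $\mathcal{L}$, and
$\dim \mathcal{C} = \dim \mathcal{L} = |\mathcal{X}| \times |\mathcal{Y}| + |\mathcal{X}| + |\mathcal{Y}|$.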

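Proof. We show that $\mathrm{aff}(\mathcal{C}) = \mathrm{aff}(\mathcal{L})$. The theorem then follows by restricting to the positive orthant, and using the
fact that $\mathcal{C} = \mathrm{aff}^+(\mathcal{C})$.

[$\mathrm{aff}(\mathcal{L}) \subseteq \mathrm{aff}(\mathcal{C})$] Since any local distribution satisfies the (linear) non-signaling constraints in Def. 1, this is also
true for any affine combination of local distributions.

[$\mathrm{aff}(\mathcal{C}) \subseteq \mathrm{aff}(\mathcal{L})$] For any $(\sigma, \pi) \in \mathcal{X} \times \mathcal{Y}$, we define the distribution $p_{\sigma\pi} = (C_{\sigma\pi}, u_{\sigma\pi}, v_{\sigma\pi})$ with correlations
$C_{\sigma\pi}(x,y) = \delta_{x=\sigma}\delta_{y=\pi}$ and marginals $u_{\sigma\pi}(x) = 0$, $v_{\sigma\pi}(y) = 0$. Similarly, we define for any $\sigma \in \mathcal{X}$ the distribution
$p_{\sigma\cdot} = (C_{\sigma\cdot}, u_{\sigma\cdot}, v_{\sigma\cdot})$ with $C_{\sigma\cdot}(x,y) = 0$, $u_{\sigma\cdot}(x) = \delta_{x=\sigma}$, $v_{\sigma\cdot}(y) = 0$, and for any $\pi \in \mathcal{Y}$ the distribution
$p_{\cdot\pi} = (C_{\cdot\pi}, u_{\cdot\pi}, v_{\cdot\pi})$ with $C_{\cdot\pi}(x,y) = 0$, $u_{\cdot\pi}(x) = 0$, $v_{\cdot\pi}(y) = \delta_{y=\pi}$. It is straightforward to check that these
$|\mathcal{X}| \times |\mathcal{Y}| + |\mathcal{X}| + |\mathcal{Y}|$ distributions are local, and that they constitute a basis for the vector space embedding $\mathrm{aff}(\mathcal{C})$, which consists
of vectors of the form $(C, u, v)$. □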

This implies that while local distributions are convex combinations of local deterministic distributions $p^\lambda \in \Lambda$,
non-signaling distributions are affine combinations of these distributions.

Corollary 6 (Affine model). A distribution $p \in \mathcal{C}$ if and only if $\exists q_\lambda \in \mathbb{R}$ with $p = \sum_{\lambda \in \Lambda} q_\lambda p^\lambda$.

Note that since $p$ is a distribution, this implies $\sum_{\lambda \in \Lambda} q_\lambda = 1$. Since weights in an affine combination may be negative,
but still sum up to one, this may be interpreted as a quasi-mixture of local distributions, some distributions being
used with possibly "negative probability". Surprisingly, this is not a new notion; see for example Groenewold [Gro85],
who gave an affine model for quantum distributions, or a discussion of "negative probability" by Feynman [Fey86].
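To make the affine model of Corollary 6 concrete, the sketch below writes one non-local distribution as an affine combination of local deterministic ones and checks the identity entry by entry. The target is the distribution $p_f$ of Definition 2 for the function $f(x,y) = (-1)^{x \cdot y}$ on one-bit inputs; the choice of $f$, the particular weights, and all names are assumptions of this example. The weights were chosen by hand and are not claimed to be optimal.

```python
from itertools import product

X = Y = (0, 1)
SIGNS = (+1, -1)
KEYS = list(product(SIGNS, SIGNS, X, Y))

def deterministic(u, v):
    """Local deterministic distribution: Alice outputs u[x], Bob outputs v[y]."""
    return {(a, b, x, y): 1.0 if (a == u[x] and b == v[y]) else 0.0
            for a, b, x, y in KEYS}

def negate(w):
    return tuple(-s for s in w)

# Target: p_f for f(x, y) = (-1)^(x*y), i.e. a*b = f(x, y) with uniform marginals.
target = {(a, b, x, y): 0.5 if a * b == (-1) ** (x * y) else 0.0
          for a, b, x, y in KEYS}

# Hand-picked affine weights on strategies (u, v); one weight is negative.
weighted_strategies = [(+0.25, (1, 1), (1, 1)),
                       (+0.25, (1, -1), (1, 1)),
                       (+0.25, (1, 1), (1, -1)),
                       (-0.25, (1, -1), (1, -1))]

model = dict.fromkeys(KEYS, 0.0)
weights = []
for q, u, v in weighted_strategies:
    # Each strategy is paired with its global flip so that the marginals cancel.
    for uu, vv in ((u, v), (negate(u), negate(v))):
        d = deterministic(uu, vv)
        weights.append(q)
        for key in KEYS:
            model[key] += q * d[key]

print(all(abs(model[k] - target[k]) < 1e-12 for k in KEYS))  # True
print(sum(weights), sum(abs(q) for q in weights))            # 1.0 2.0
```

The weights sum to one, as required of an affine combination, while the sum of their absolute values is 2; the negative weights are the "negative probabilities" mentioned in the text.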



2.4 Characterization of the set of quantum distributions 

As for the set of quantum distributions Q, it is known to be convex, but not a polytope. Although no simple 
characterization of $\mathcal{Q}$ is known, Navascués, Pironio and Acín have given a characterization for a hierarchy of sets
$\{\mathcal{Q}^n : n \in \mathbb{N}_0\}$, such that $\mathcal{Q}^n \subseteq \mathcal{Q}^{n-1}$ and $\mathcal{Q}^n \to \mathcal{Q}$ for $n \to \infty$ [NPA08]. We briefly introduce this hierarchy
because it will be useful in Section 4, but we refer the reader to [NPA08] for full details.

Let $S_n$ be the set of all monomials of degree up to $n$ in measurement operators $E_a(x)$ and $E_b(y)$ (for example,
$\mathbb{1}$, $E_a(x)$ and $E_a(x)E_{a'}(x)E_b(y)$ are monomials of degree 0, 1 and 3, respectively). Due to the conditions in Definition 3,
the operators in $S_n$ (and their Hermitian conjugates) satisfy linear equations such as $E_a(x)^\dagger - E_a(x) = 0$,
$\sum_a E_a(x) - \mathbb{1} = 0$, or higher order equations such as $E_a(x)^\dagger E_a(x) E_b(y) - E_b(y)^\dagger E_a(x) = 0$. Let us suppose
that we have $m(n)$ linearly independent equations for the operators in $S_n$. These equations may be written as
$\sum_{S,T \in S_n} (F_k)_{S,T}\, S^\dagger T = 0$, where, for all $k \in [m(n)]$, $F_k$ is a matrix whose rows and columns are labelled by
the elements of $S_n$. We are now ready to define the set of distributions $\mathcal{Q}^n$.

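Definition 5 (Quantum hierarchy). A distribution $p$ is in $\mathcal{Q}^n$ if and only if there exists a positive-semidefinite matrix
$\Gamma \succeq 0$, whose rows and columns are labelled by the elements of $S_n$, satisfying

1. $\Gamma_{\mathbb{1},\mathbb{1}} = 1$,

2. $\Gamma_{E_a(x),E_b(y)} = p(a,b|x,y)$ for all $(a,b,x,y) \in \mathcal{A} \times \mathcal{B} \times \mathcal{X} \times \mathcal{Y}$,

3. $\mathrm{tr}(F_k^T \Gamma) = 0$ for all $k \in [m(n)]$.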

Comparing with Definition 3 for $\mathcal{Q}$, we immediately get that $\mathcal{Q} \subseteq \mathcal{Q}^n$ by setting $\Gamma_{S,T} = \langle\psi|S^\dagger T|\psi\rangle$. The proof
that $\mathcal{Q}^n$ converges to $\mathcal{Q}$ is much more involved and is given in [NPA08].

In the special case of binary outcomes with uniform marginals, the hierarchy collapses at the first level, that is, 
$\mathcal{Q}_0^1 = \mathcal{Q}_0$. This was known before the hierarchy was introduced, as a consequence of the following theorem of
Tsirelson. 

Theorem 7 ([Tsi85]). Let $S_n$ be the set of unit vectors in $\mathbb{R}^n$, and $\mathcal{H}^d$ be a $d$-dimensional Hilbert space.

1. If $(C, M_A, M_B) \in \mathcal{Q}$ is a probability distribution obtained by performing binary measurements on a quantum
state $|\psi\rangle \in \mathcal{H}^d \otimes \mathcal{H}^d$, then there exist vectors $\vec{a}(x), \vec{b}(y) \in S_{2d^2}$ such that $C(x,y) = \vec{a}(x) \cdot \vec{b}(y)$.

2. If $\vec{a}(x), \vec{b}(y)$ are unit vectors in $S_n$, then there exists a probability distribution $(C, 0, 0) \in \mathcal{Q}$ obtained by
performing binary measurements on a maximally entangled state $|\psi\rangle \in \mathcal{H}^{2^{\lfloor n/2 \rfloor}} \otimes \mathcal{H}^{2^{\lfloor n/2 \rfloor}}$ such that $C(x,y) =
\vec{a}(x) \cdot \vec{b}(y)$.

Corollary 8. $\mathcal{Q}_0 = \{C : C(x,y) = \vec{a}(x) \cdot \vec{b}(y),\ \|\vec{a}(x)\| = \|\vec{b}(y)\| = 1\ \forall x, y\}$.

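Clearly, $\mathcal{L} \subseteq \mathcal{Q} \subseteq \mathcal{C}$. The existence of Grothendieck's constant (see e.g. [AN06]) implies the following statement.

Proposition 9. $\mathcal{L}_0 \subseteq \mathcal{Q}_0 \subseteq K_G \mathcal{L}_0$, where $K_G$ is Grothendieck's constant.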
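The vector characterization of Corollary 8 and the gap allowed by Proposition 9 can be illustrated with the standard CHSH expression $C(0,0) + C(0,1) + C(1,0) - C(1,1)$. The sketch below evaluates it on correlations given by inner products of planar unit vectors, and compares with the maximum over local deterministic correlations $u(x) v(y)$. The specific angles and all names are assumptions of this example.

```python
import math
from itertools import product

def unit(theta):
    """Unit vector in the plane at angle theta."""
    return (math.cos(theta), math.sin(theta))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def chsh(C):
    """CHSH expression for correlations C indexed by one-bit inputs (x, y)."""
    return C[(0, 0)] + C[(0, 1)] + C[(1, 0)] - C[(1, 1)]

# Quantum correlations in the sense of Corollary 8: C(x, y) = a(x) . b(y).
a = {0: unit(0.0), 1: unit(math.pi / 2)}
b = {0: unit(math.pi / 4), 1: unit(-math.pi / 4)}
quantum_C = {(x, y): dot(a[x], b[y]) for x in (0, 1) for y in (0, 1)}

# Local deterministic correlations: C(x, y) = u[x] * v[y] with entries in {+1, -1}.
local_max = max(
    chsh({(x, y): u[x] * v[y] for x in (0, 1) for y in (0, 1)})
    for u in product((+1, -1), repeat=2)
    for v in product((+1, -1), repeat=2)
)

print(chsh(quantum_C))              # close to 2 * sqrt(2)
print(local_max)                    # 2
print(chsh(quantum_C) / local_max)  # close to sqrt(2)
```

Since the expression is linear, its maximum over the local correlations is attained at a deterministic point, and Proposition 9 implies that the ratio between the two values cannot exceed $K_G$; here it is $\sqrt{2}$.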

3 Lower bounds for non-signaling distributions 

We extend Linial and Shraibman's factorization norm ($\gamma_2$) and nuclear norm ($\nu$) lower bound methods [LS09] to
the simulation of any non-signaling distributions. The proof we give is simple, especially in the setting studied by
Linial and Shraibman, for Boolean functions, which corresponds in our setting to binary outputs and uniform marginal
distributions. The main intuition is that $c$ bits of communication can increase correlations by at most a factor of $2^c$.

3.1 Communication vs scaled-down distribution 

We first show that if a distribution $p$ may be simulated with $t$ bits of communication (or $q$ qubits of quantum communication),
then a scaled-down version of this distribution is local (or quantum). From this local (or quantum) distribution,
we derive an affine model for $p$ (Theorem 13), which gives the lower bound on communication.

Lemma 10. Let $p$ be a non-signaling distribution over $\mathcal{A} \times \mathcal{B}$ with input set $\mathcal{X} \times \mathcal{Y}$.

1. Assume that $R_0^{\mathrm{pub}}(p) \leq t$. Then there exist two marginal distributions $p_A(a|x)$ and $p_B(b|y)$ such that the
distribution $p_l(a,b|x,y) = \frac{1}{2^t} p(a,b|x,y) + \left(1 - \frac{1}{2^t}\right) p_A(a|x) p_B(b|y)$ is local.

2. Assume that $Q_0^{\mathrm{ent}}(p) \leq q$. Then there exist two marginal distributions $p_A(a|x)$ and $p_B(b|y)$ such that the
distribution $p_l(a,b|x,y) = \frac{1}{2^{2q}} p(a,b|x,y) + \left(1 - \frac{1}{2^{2q}}\right) p_A(a|x) p_B(b|y)$ is quantum.

3. Assume that $p = (C, 0, 0)$ and $Q_0^{\mathrm{ent}}(C) \leq q$. Then $C/2^q \in \mathcal{Q}_0$.

Proof. We assume that the length of the transcript is exactly $t$ bits for each execution of the protocol, adding dummy
bits if necessary. We now fix some notation. In the original protocol, the players pick a random string $\lambda$ and exchange
some communication whose transcript is denoted $T(x, y, \lambda)$. Alice then outputs some value $a$ according to
a probability distribution $p_P(a|x, \lambda, T)$. Similarly, Bob outputs some value $b$ according to a probability distribution
$p_P(b|y, \lambda, T)$.

From Alice's point of view, on input $x$ and shared randomness $\lambda$, only a subset of the set of all $t$-bit transcripts
can be produced: the transcripts $S \in \{0,1\}^t$ for which there exists a $y$ such that $S = T(x, y, \lambda)$. We will call these
transcripts the set of valid transcripts for $(x, \lambda)$. The set of valid transcripts for Bob is defined similarly. We denote
these sets respectively $U_{x,\lambda}$ and $V_{y,\lambda}$.

We now define a local protocol for the distribution $p_l(a,b|x,y)$:

• As in the original protocol, Alice and Bob initially share some random string $\lambda$.

• Using additional shared randomness, Alice and Bob choose a transcript $T$ uniformly at random in $\{0,1\}^t$.

• If $T$ is a valid transcript for $(x, \lambda)$, Alice outputs $a$ according to the distribution $p_P(a|x, \lambda, T)$. If it is not, Alice
outputs $a$ according to a distribution $p_A(a|x)$ which we will define later.

• Bob does the same. We will also define the distribution $p_B(b|y)$ later.

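Let $\mu$ be the distribution over the randomness and the $t$-bit strings in the local protocol. By definition, the distribution
produced by this protocol is

\[
\begin{aligned}
p_l(a,b|x,y) = \sum_\lambda \mu(\lambda) \Big[ & \sum_{T \in U_{x,\lambda} \cap V_{y,\lambda}} \mu(T)\, p_P(a|x,\lambda,T)\, p_P(b|y,\lambda,T)
 + p_B(b|y) \sum_{T \in U_{x,\lambda} \cap \bar{V}_{y,\lambda}} \mu(T)\, p_P(a|x,\lambda,T) \\
 & + p_A(a|x) \sum_{T \in \bar{U}_{x,\lambda} \cap V_{y,\lambda}} \mu(T)\, p_P(b|y,\lambda,T)
 + p_B(b|y)\, p_A(a|x) \sum_{T \in \bar{U}_{x,\lambda} \cap \bar{V}_{y,\lambda}} \mu(T) \Big].
\end{aligned}
\]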

We now analyze each term separately. For fixed inputs $x, y$ and shared randomness $\lambda$, there is only one transcript
which is valid for both Alice and Bob, and when they use this transcript for each $\lambda$, they output according to the
distribution $p$. Therefore, we have

\[ \sum_\lambda \mu(\lambda) \sum_{T \in U_{x,\lambda} \cap V_{y,\lambda}} \mu(T)\, p_P(a|x,\lambda,T)\, p_P(b|y,\lambda,T) = \frac{1}{2^t}\, p(a,b|x,y). \]

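Let $A_x$ be the event that Alice's transcript is valid for $x$ (over random $\lambda, T$), and $\bar{A}_x$ its negation (similarly $B_y$ and
$\bar{B}_y$ for Bob). We denote

\[ p_P(a|x, A_x \cap \bar{B}_y) = \frac{\sum_\lambda \mu(\lambda) \sum_{T \in U_{x,\lambda} \cap \bar{V}_{y,\lambda}} \mu(T)\, p_P(a|x,\lambda,T)}{\mu(A_x \cap \bar{B}_y)}, \]

where, by definition, we have $\mu(A_x \cap \bar{B}_y) = \sum_\lambda \mu(\lambda) \sum_{T \in U_{x,\lambda} \cap \bar{V}_{y,\lambda}} \mu(T)$. We will show that this distribution is
independent of $y$ and that the corresponding distribution $p_P(b|y, \bar{A}_x \cap B_y)$ for Bob is independent of $x$. Using these
distributions, we may write $p_l(a,b|x,y)$ as

\[
\begin{aligned}
p_l(a,b|x,y) = {} & \frac{1}{2^t}\, p(a,b|x,y) + \mu(A_x \cap \bar{B}_y)\, p_B(b|y)\, p_P(a|x, A_x \cap \bar{B}_y) \\
 & + \mu(\bar{A}_x \cap B_y)\, p_A(a|x)\, p_P(b|y, \bar{A}_x \cap B_y) + \mu(\bar{A}_x \cap \bar{B}_y)\, p_B(b|y)\, p_A(a|x).
\end{aligned}
\]

Summing over $b$, and using the fact that $p_l$ and $p$ are non-signaling, we have

\[
\begin{aligned}
p_l(a|x) & = \frac{1}{2^t}\, p(a|x) + \mu(A_x \cap \bar{B}_y)\, p_P(a|x, A_x \cap \bar{B}_y) + \mu(\bar{A}_x \cap B_y)\, p_A(a|x) + \mu(\bar{A}_x \cap \bar{B}_y)\, p_A(a|x) \\
 & = \frac{1}{2^t}\, p(a|x) + \mu(A_x \cap \bar{B}_y)\, p_P(a|x, A_x \cap \bar{B}_y) + \mu(\bar{A}_x)\, p_A(a|x).
\end{aligned}
\]

Note that by definition, $\mu(A_x) = \sum_\lambda \mu(\lambda) \sum_{T \in U_{x,\lambda}} \mu(T)$ is independent of $y$, therefore so is $\mu(A_x \cap \bar{B}_y) = \mu(A_x) -
\mu(A_x \cap B_y) = \mu(A_x) - \frac{1}{2^t}$. From the expression for $p_l(a|x)$, we can conclude that $p_P(a|x, A_x \cap \bar{B}_y)$ is independent
of $y$ and can be evaluated by Alice (and similarly for the analogous distribution for Bob). We now set

\[ p_A(a|x) = p_P(a|x, A_x \cap \bar{B}_y), \qquad p_B(b|y) = p_P(b|y, \bar{A}_x \cap B_y). \]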

Therefore, the final distribution obtained from the local protocol may be written as

\[
\begin{aligned}
p_l(a,b|x,y) & = \frac{1}{2^t}\, p(a,b|x,y) + \mu(A_x \cap \bar{B}_y)\, p_A(a|x)\, p_B(b|y)
 + \mu(\bar{A}_x \cap B_y)\, p_A(a|x)\, p_B(b|y) + \mu(\bar{A}_x \cap \bar{B}_y)\, p_A(a|x)\, p_B(b|y) \\
 & = \frac{1}{2^t}\, p(a,b|x,y) + \left(1 - \frac{1}{2^t}\right) p_A(a|x)\, p_B(b|y).
\end{aligned}
\]

For quantum protocols, we first simulate quantum communication using shared entanglement and teleportation, 
which uses 2 bits of classical communication for each qubit. Starting with this protocol using 2q bits of classical 
communication, we may use the same idea as in the classical case, that is choosing a random 2q-bit string interpreted 
as the transcript, and replacing the players' respective outputs by independent random outputs chosen according to $p_A$
and $p_B$ if the random transcript does not match the bits they would have sent in the original protocol.

In the case of binary outputs with uniform marginals, that is, $p = (C, 0, 0)$, we may improve the exponent of the
scaling-down coefficient $2^{2q}$ by a factor of 2 using a more involved analysis and a variation of a result by [Kre95,
Yao93, LS09] (the proof is given in Appendix A for completeness).

Lemma 11 ([Kre95, Yao93, LS09]). Let $(C, M_A, M_B)$ be a distribution simulated by a quantum protocol with shared
entanglement using $q_A$ qubits of communication from Alice to Bob and $q_B$ qubits from Bob to Alice. There exist vectors
$\vec{a}(x), \vec{b}(y)$ with $\|\vec{a}(x)\| \leq 2^{q_B}$ and $\|\vec{b}(y)\| \leq 2^{q_A}$ such that $C(x,y) = \vec{a}(x) \cdot \vec{b}(y)$.

The fact that $C/2^q \in \mathcal{Q}_0$ then follows from Theorem 7, part 2. □
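The first item of Lemma 10 can be checked by hand on a toy case. The sketch below takes the distribution $p_f$ for $f(x,y) = (-1)^{x \cdot y}$ on one-bit inputs, which one bit of communication and shared randomness suffice to simulate (Alice sends $x$; she outputs a shared random sign $r$ and Bob outputs $r \cdot f(x,y)$), and verifies that scaling it down with $t = 1$ gives a convex combination of local deterministic distributions. The choice of $f$, the uniform $p_A$ and $p_B$, the list of strategies and all names are assumptions of this illustration.

```python
from itertools import product

X = Y = (0, 1)
SIGNS = (+1, -1)
KEYS = list(product(SIGNS, SIGNS, X, Y))

def deterministic(u, v):
    """Local deterministic distribution: Alice outputs u[x], Bob outputs v[y]."""
    return {(a, b, x, y): 1.0 if (a == u[x] and b == v[y]) else 0.0
            for a, b, x, y in KEYS}

t = 1  # number of bits used by the assumed protocol

# p_f for f(x, y) = (-1)^(x*y), with uniform marginals.
p = {(a, b, x, y): 0.5 if a * b == (-1) ** (x * y) else 0.0 for a, b, x, y in KEYS}

# Scaled-down distribution of Lemma 10, taking p_A and p_B uniform (an assumption).
p_l = {k: p[k] / 2 ** t + (1 - 1 / 2 ** t) * 0.25 for k in KEYS}

# Candidate convex decomposition: four strategies and their global flips, weight 1/8 each.
strategies = [((1, 1), (1, 1)), ((1, -1), (1, 1)), ((1, 1), (1, -1)), ((1, -1), (-1, 1))]
mixture = dict.fromkeys(KEYS, 0.0)
total_weight = 0.0
for u, v in strategies:
    for s in SIGNS:
        d = deterministic(tuple(s * ui for ui in u), tuple(s * vi for vi in v))
        total_weight += 1 / 8
        for k in KEYS:
            mixture[k] += d[k] / 8

print(all(abs(mixture[k] - p_l[k]) < 1e-12 for k in KEYS))  # True
print(total_weight)                                         # 1.0
```

All eight weights are nonnegative and sum to one, so the scaled-down distribution is local even though $p_f$ itself is not.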

3.2 Communication vs affine models 

By Theorem 5, we know that any non-signaling distribution can be written as an affine combination of local distributions, which we call an affine model. In this section we show that, using Lemma 10, an explicit affine model can be derived from a (classical or quantum) communication protocol for $p$, which gives us a lower bound technique for communication complexity in terms of how "good" the affine model is.

Let us define the following quantities, which as we will see may be considered as extensions of the $\nu$ and $\gamma_2$ quantities of [LS09] (defined below) to distributions.

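Definition 6.

• $\tilde\nu(p) = \min\{\sum_i |q_i| : \exists p_i \in \mathcal{L},\ q_i \in \mathbb{R},\ p = \sum_i q_i p_i\}$,

• $\tilde\gamma_2(p) = \min\{\sum_i |q_i| : \exists p_i \in \mathcal{Q},\ q_i \in \mathbb{R},\ p = \sum_i q_i p_i\}$,

• $\tilde\nu^\epsilon(p) = \min\{\tilde\nu(p') : \delta(p, p') \le \epsilon\}$,

• $\tilde\gamma_2^\epsilon(p) = \min\{\tilde\gamma_2(p') : \delta(p, p') \le \epsilon\}$.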

The quantities $\tilde\nu(p)$ and $\tilde\gamma_2(p)$ show how well $p$ may be represented as an affine combination of local or quantum distributions, a good affine combination being one where the sum of absolute values of the coefficients $q_i$ is as low as possible. For a local distribution, we may take positive coefficients $q_i$, and therefore obtain the minimum possible value $\tilde\nu(p) = 1$ (note that $\sum_i q_i p_i = p$ implies in particular $\sum_i q_i = 1$), and similarly for quantum distributions, so that

Lemma 12. $p \in \mathcal{L} \iff \tilde\nu(p) = 1$, and $p \in \mathcal{Q} \iff \tilde\gamma_2(p) = 1$.

In other words, the set of local distributions $\mathcal{L}$ forms the unit sphere of $\tilde\nu$, and similarly the set of quantum distributions $\mathcal{Q}$ forms the unit sphere of $\tilde\gamma_2$. In the binary case, observe that by Proposition 1 we have $\tilde\gamma_2(C) \le \tilde\gamma_2(C, u, v)$ and $\tilde\nu(C) \le \tilde\nu(C, u, v)$. By Proposition 9, $\tilde\gamma_2(C) \le \tilde\nu(C) \le K_G\, \tilde\gamma_2(C)$. Similar properties hold for the approximate versions $\tilde\nu^\epsilon(C)$ and $\tilde\gamma_2^\epsilon(C)$.

We have shown (Lemma 10) that distributions scaled down exponentially in the communication are local; from
these local protocols we can build up an affine model for the original distribution, in order to establish the lower bound.

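Theorem 13. Let $p$ be a non-signaling distribution over $\mathcal{A} \times \mathcal{B}$ with input set $\mathcal{X} \times \mathcal{Y}$, and $C : \mathcal{X} \times \mathcal{Y} \to [-1, 1]$ be a correlation matrix.

1. If $R_0^{\mathrm{pub}}(p) \le t$, then $\tilde\nu(p) \le 2^{t+1} - 1$.

2. If $R_0^{\mathrm{pub}}(C) \le t$, then $\tilde\nu(C) \le 2^{t}$.

3. If $Q_0^{\mathrm{ent}}(p) \le q$, then $\tilde\gamma_2(p) \le 2^{2q+1} - 1$.

4. If $Q_0^{\mathrm{ent}}(C) \le q$, then $\tilde\gamma_2(C) \le 2^{q}$.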

Proof. We give a proof for the classical case; the quantum case follows by using teleportation. Let $t$ be the number of bits exchanged. From Lemma 10, we know that there exist marginal distributions $p_A(a|x)$ and $p_B(b|y)$ such that $p_l(a,b|x,y) = \frac{1}{2^t}\, p(a,b|x,y) + (1 - \frac{1}{2^t})\, p_A(a|x)\, p_B(b|y)$ is local. This gives an affine model for $p(a,b|x,y)$, as the following combination of two local distributions:

$p(a,b|x,y) = 2^t\, p_l(a,b|x,y) + (1 - 2^t)\, p_A(a|x)\, p_B(b|y).$

The absolute values of the two coefficients sum to $2^t + (2^t - 1)$, so $\tilde\nu(p) \le 2^{t+1} - 1$.

In the case of binary outputs with uniform marginals, $p_l = (C/2^t, 0, 0)$, and Lemma 10 implies that $C/2^t \in \mathcal{L}_0$. By following the local protocol for $C/2^t$ and letting Alice flip her output, we also get a local protocol for $-C/2^t$, so $-C/2^t \in \mathcal{L}_0$ as well. Notice that we may build an affine model for $C$ as a combination of $C/2^t$ and $-C/2^t$:

$C = \frac{1}{2}(2^t + 1)\, \frac{C}{2^t} - \frac{1}{2}(2^t - 1)\, \frac{-C}{2^t}.$

Then, $\tilde\nu(C) \le 2^t$. □
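The two affine models used in this proof can be checked with exact arithmetic. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the distribution is restricted to a single input pair $(x, y)$, and the marginals $p_A$, $p_B$ are chosen arbitrarily (the code does not verify that the scaled-down distribution is local, which is the content of Lemma 10).

```python
from fractions import Fraction

def scaled_down(p, p_a, p_b, t):
    """p_l = p / 2^t + (1 - 1/2^t) * p_A * p_B for one fixed input pair (x, y)."""
    s = Fraction(1, 2 ** t)
    return {(a, b): s * p[(a, b)] + (1 - s) * p_a[a] * p_b[b] for (a, b) in p}

def recombine(p_l, p_a, p_b, t):
    """Affine model p = 2^t * p_l + (1 - 2^t) * p_A * p_B."""
    q_plus = Fraction(2 ** t)
    q_minus = 1 - q_plus
    return {(a, b): q_plus * p_l[(a, b)] + q_minus * p_a[a] * p_b[b] for (a, b) in p_l}

def distribution_weight(t):
    """Sum of absolute values of the coefficients in the model for p."""
    return abs(Fraction(2 ** t)) + abs(1 - Fraction(2 ** t))

def correlation_weight(t):
    """Coefficients (2^t + 1)/2 and -(2^t - 1)/2 of C/2^t and -C/2^t."""
    q_plus = Fraction(2 ** t + 1, 2)
    q_minus = -Fraction(2 ** t - 1, 2)
    # The model must reproduce C: q_plus * (1/2^t) + q_minus * (-1/2^t) == 1.
    assert q_plus / 2 ** t - q_minus / 2 ** t == 1
    return abs(q_plus) + abs(q_minus)

# Toy example (an assumption for illustration): perfectly correlated bits.
half = Fraction(1, 2)
p = {(0, 0): half, (0, 1): Fraction(0), (1, 0): Fraction(0), (1, 1): half}
uniform = {0: half, 1: half}
t = 3
p_l = scaled_down(p, uniform, uniform, t)
rebuilt = recombine(p_l, uniform, uniform, t)
```

With `t = 3`, the coefficient sums are $2^{t+1} - 1 = 15$ for the distribution and $2^t = 8$ for the correlation matrix, matching items 1 and 2 of the theorem.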

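This implies the following lower bounds on classical and quantum communication complexity:

Corollary 14. For any non-signaling distribution $p$ and correlation matrix $C$,

1. $R_0^{\mathrm{pub}}(p) \ge \log(\tilde\nu(p)) - 1$, and $R_\epsilon^{\mathrm{pub}}(p) \ge \log(\tilde\nu^\epsilon(p)) - 1$.

2. $Q_0^{\mathrm{ent}}(p) \ge \frac{1}{2}\log(\tilde\gamma_2(p)) - 1$, and $Q_\epsilon^{\mathrm{ent}}(p) \ge \frac{1}{2}\log(\tilde\gamma_2^\epsilon(p)) - 1$.

3. $Q_0^{\mathrm{ent}}(C) \ge \log(\tilde\gamma_2(C))$, and $Q_\epsilon^{\mathrm{ent}}(C) \ge \log(\tilde\gamma_2^\epsilon(C))$.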

3.3 Factorization norm and related measures 

In the special case of distributions over binary variables with uniform marginals, the quantities $\tilde\nu$ and $\tilde\gamma_2$ become equivalent to the original quantities defined in [LMSS07, LS09] (at least for the interesting case of non-local correlations, that is, correlations with non-zero communication complexity). When the marginals are uniform we omit them and write $\tilde\nu(C)$ and $\tilde\gamma_2(C)$. The following are reformulations as Minkowski functionals of the definitions appearing in [LMSS07, LS09].

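Definition 7.

• $\nu(C) = \min\{\Lambda > 0 : \frac{1}{\Lambda} C \in \mathcal{L}_0\}$,

• $\gamma_2(C) = \min\{\Lambda > 0 : \frac{1}{\Lambda} C \in \mathcal{Q}_0\}$,

• $\nu^\alpha(C) = \min\{\nu(C') : 1 \le C(x,y)\, C'(x,y) \le \alpha,\ \forall x, y \in \mathcal{X} \times \mathcal{Y}\}$,

• $\gamma_2^\alpha(C) = \min\{\gamma_2(C') : 1 \le C(x,y)\, C'(x,y) \le \alpha,\ \forall x, y \in \mathcal{X} \times \mathcal{Y}\}$.

Lemma 15. For any correlation matrix $C : \mathcal{X} \times \mathcal{Y} \to [-1, 1]$,

1. $\tilde\nu(C) = 1$ iff $\nu(C) \le 1$, and $\tilde\gamma_2(C) = 1$ iff $\gamma_2(C) \le 1$,

2. $\tilde\nu(C) > 1 \implies \nu(C) = \tilde\nu(C)$,

3. $\tilde\gamma_2(C) > 1 \implies \gamma_2(C) = \tilde\gamma_2(C)$.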
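Proof. The first item follows by definition of $\nu$ and $\gamma_2$. For the next items, we give the proof for $\nu$, and the proof for $\gamma_2$ is similar. The key to the proof is that if $C \in \mathcal{L}_0$, then $-C \in \mathcal{L}_0$ (it suffices for one of the players to flip his output).

[$\tilde\nu(C) \le \nu(C)$] If $\tilde\nu(C) > 1$, then $\Lambda = \nu(C) > 1$. Let $C^+ = \frac{C}{\Lambda}$ and $C^- = -\frac{C}{\Lambda}$. By definition of $\nu(C)$, both $C^+$ and $C^-$ are in $\mathcal{L}_0$. Furthermore, let $q_+ = \frac{1 + \Lambda}{2} \ge 0$ and $q_- = \frac{1 - \Lambda}{2} \le 0$. Since $C = q_+ C^+ + q_- C^-$, this determines an affine model for $C$ with $|q_+| + |q_-| = \Lambda$.

[$\tilde\nu(C) \ge \nu(C)$] Let $\Lambda = \tilde\nu(C)$. By definition of $\tilde\nu(C)$, there exist $C_i$ and $q_i$ such that $C = \sum_i q_i C_i$ and $\Lambda = \sum_i |q_i|$. Let $\tilde{C}_i = \mathrm{sgn}(q_i)\, C_i$ and $p_i = \frac{|q_i|}{\Lambda}$. Then $\frac{C}{\Lambda} = \sum_i p_i \tilde{C}_i$, and therefore $\frac{1}{\Lambda} C \in \mathcal{L}_0$ since $\tilde{C}_i \in \mathcal{L}_0$. □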

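In the special case of sign matrices (corresponding to Boolean functions, as shown above), we also have the following correspondence between $\tilde\nu^\epsilon$, $\tilde\gamma_2^\epsilon$, and $\nu^\alpha$, $\gamma_2^\alpha$.

Lemma 16. Let $0 \le \epsilon < 1/2$ and $\alpha = \frac{1}{1 - 2\epsilon}$. For any sign matrix $C : \mathcal{X} \times \mathcal{Y} \to \{-1, 1\}$,

1. $\tilde\nu^\epsilon(C) > 1 \implies \nu^\alpha(C) = \frac{\tilde\nu^\epsilon(C)}{1 - 2\epsilon}$,

2. $\tilde\gamma_2^\epsilon(C) > 1 \implies \gamma_2^\alpha(C) = \frac{\tilde\gamma_2^\epsilon(C)}{1 - 2\epsilon}$.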

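Proof. We give the proof for $\nu^\alpha$; the proof for $\gamma_2^\alpha$ is similar.

[$\nu^\alpha(C) \le \frac{\tilde\nu^\epsilon(C)}{1 - 2\epsilon}$] By definition of $\tilde\nu^\epsilon(C)$, there exists a correlation matrix $C'$ such that $\tilde\nu(C') = \tilde\nu^\epsilon(C)$ and $|C(x,y) - C'(x,y)| \le 2\epsilon$ for all $x, y \in \mathcal{X} \times \mathcal{Y}$. Since $C$ is a sign matrix, and $C'$ is a correlation matrix, $\mathrm{sgn}(C'(x,y)) = C(x,y)$ and $1 - 2\epsilon \le |C'(x,y)| \le 1$. Hence $1 \le C(x,y)\, \frac{C'(x,y)}{1 - 2\epsilon} \le \frac{1}{1 - 2\epsilon} = \alpha$. This implies that $\nu^\alpha(C) \le \nu\!\left(\frac{C'}{1 - 2\epsilon}\right) = \frac{\nu(C')}{1 - 2\epsilon} = \frac{\tilde\nu^\epsilon(C)}{1 - 2\epsilon}$, where we used the fact that $\nu(C') = \tilde\nu(C')$ since $\tilde\nu(C') > 1$.

[$\nu^\alpha(C) \ge \frac{\tilde\nu^\epsilon(C)}{1 - 2\epsilon}$] By definition of $\nu^\alpha(C)$, there exists a (not necessarily correlation) matrix $C'$ such that $\nu(C') = \nu^\alpha(C)$ and $1 \le C(x,y)\, C'(x,y) \le \alpha$ for all $x, y$. Since $C$ is a sign matrix, this implies $\mathrm{sgn}(C'(x,y)) = C(x,y)$ and $1 - 2\epsilon \le \left|\frac{C'(x,y)}{\alpha}\right| \le 1$. Therefore, $\left|C(x,y) - \frac{C'(x,y)}{\alpha}\right| \le 2\epsilon$ for all $x, y$. This implies that $\tilde\nu^\epsilon(C) \le \tilde\nu\!\left(\frac{C'}{\alpha}\right) = \nu\!\left(\frac{C'}{\alpha}\right) = (1 - 2\epsilon)\, \nu(C')$, where we have used the fact that $\tilde\nu\!\left(\frac{C'}{\alpha}\right) = \nu\!\left(\frac{C'}{\alpha}\right)$ since $\tilde\nu\!\left(\frac{C'}{\alpha}\right) \ge \tilde\nu^\epsilon(C) > 1$. □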

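Just as the special case $\nu(C)$, $\tilde\nu(p)$ may be expressed as a linear program. However, while $\gamma_2(C)$ could be expressed as a semidefinite program, this may not be true in general for $\tilde\gamma_2(p)$. Nevertheless, using the hierarchy $\{\mathcal{Q}^n : n \in \mathbb{N}_0\}$ introduced in [NPA08], it admits SDP relaxations $\{\tilde\gamma_2^n(p) : n \in \mathbb{N}_0\}$.

Definition 8. $\tilde\gamma_2^n(p) = \min\{\sum_i |q_i| : \exists p_i \in \mathcal{Q}^n,\ q_i \in \mathbb{R},\ p = \sum_i q_i p_i\}$.

The fact that $\mathcal{Q}^n \subseteq \mathcal{Q}^{n-1}$ implies $\tilde\gamma_2^n(p) \ge \tilde\gamma_2^{n-1}(p)$, and by continuity of the minimization function, $\tilde\gamma_2^n(p) \to \tilde\gamma_2(p)$ for $n \to \infty$.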

Lemmas 15 and 16 establish that Corollary 14 is a generalization of Linial and Shraibman's factorization norm lower bound technique. Note that Linial and Shraibman use $\gamma_2^\alpha$ to derive a lower bound not only on the quantum communication complexity $Q_\epsilon^{\mathrm{ent}}$, but also on the classical complexity $R_\epsilon^{\mathrm{pub}}$. In the case of binary outcomes with uniform marginals (which includes Boolean functions, studied by Linial and Shraibman, as a special case), we obtain a similar result by combining our bound for $Q_\epsilon^{\mathrm{ent}}(C)$ with the fact that $Q_\epsilon^{\mathrm{ent}}(C) \le \lceil \frac{1}{2} R_\epsilon^{\mathrm{pub}}(C) \rceil$, which follows from superdense coding. This implies $R_\epsilon^{\mathrm{pub}}(C) \ge 2\log(\tilde\gamma_2^\epsilon(C)) - 1$. In the general case, however, we can only prove that $R_\epsilon^{\mathrm{pub}}(p) \ge \log(\tilde\gamma_2^\epsilon(p)) - 1$. This may be due to the fact that the result holds in the much more general setting of non-signaling distributions with arbitrary outcomes and marginals.

Because of Proposition 9, we know that $\nu(C) \le K_G\, \gamma_2(C)$ for correlations. Note also that although $\gamma_2$ and $\nu$ are
matrix norms, this fails to be the case for $\tilde\gamma_2$ and $\tilde\nu$, even in the case of correlations. Nevertheless, it is still possible to
formulate dual quantities, which turn out to have sufficient structure, as we show in the next section.

4 Duality, Bell inequalities, and XOR games 

In their primal formulation, the $\tilde\gamma_2$ and $\tilde\nu$ methods are difficult to apply since they are formulated as a minimization problem. Transposing to the dual space not only turns the method into a maximization problem; it also has a very natural, well-understood interpretation since it coincides with maximal violations of Bell and Tsirelson inequalities. This is particularly relevant to physics, since it formalizes in very precise terms the intuition that distributions with large Bell inequality violations should require more communication to simulate.

Recall that for any norm $\|\cdot\|$ on a vector space $V$, the dual norm is $\|B\|^* = \max_{v \in V : \|v\| \le 1} B(v)$, where $B$ is a linear functional on $V$.

4.1 Bell and Tsirelson inequalities 

Bell inequalities were first introduced by Bell [Bel64], as bounds on the correlations that could be achieved by any local physical theory. He showed that quantum correlations could violate these inequalities and therefore exhibited non-locality. Tsirelson later proved that quantum correlations should also respect some bound (known as the Tsirelson bound), giving a first example of a "Tsirelson-like" inequality for quantum distributions [Tsi80].

Since the set of non-signaling distributions $\mathcal{C}$ lies in an affine space $\mathrm{aff}(\mathcal{C})$, we may consider the isomorphic dual space of linear functionals over this space. The dual quantity $\tilde\nu^*$ (technically not a dual norm since $\tilde\nu$ itself is not a norm in the general case) is the maximum value of a linear functional in the dual space on local distributions, and $\tilde\gamma_2^*$ is the maximum value of a linear functional on quantum distributions. These are exactly what is captured by the Bell and Tsirelson inequalities.

Definition 9 (Bell and Tsirelson inequalities). Let $B : \mathrm{aff}(\mathcal{C}) \to \mathbb{R}$ be a linear functional on the (affine hull of the) set of non-signaling distributions, $B(p) = \sum_{a,b,x,y} B_{abxy}\, p(a,b|x,y)$. Define $\tilde\nu^*(B) = \max_{p \in \mathcal{L}} B(p)$ and $\tilde\gamma_2^*(B) = \max_{p \in \mathcal{Q}} B(p)$. A Bell inequality is a linear inequality satisfied by any local distribution, $B(p) \le \tilde\nu^*(B)$ ($\forall p \in \mathcal{L}$), and a Tsirelson inequality is a linear inequality satisfied by any quantum distribution, $B(p) \le \tilde\gamma_2^*(B)$ ($\forall p \in \mathcal{Q}$).

By linearity (Proposition 1), Bell inequalities are often expressed as linear functionals over the correlations in the case of binary outputs and uniform marginals.

Finally, $\tilde\gamma_2$ and $\tilde\nu$ amount to finding a maximum violation of a (normalized) Bell or Tsirelson inequality.

Theorem 17. For any distribution $p \in \mathcal{C}$,

1. $\tilde\nu(p) = \max\{B(p) : \forall p' \in \mathcal{L},\ |B(p')| \le 1\}$, and

2. $\tilde\gamma_2(p) = \max\{B(p) : \forall p' \in \mathcal{Q},\ |B(p')| \le 1\}$,

where the maximization is over linear functionals $B : \mathrm{aff}(\mathcal{C}) \to \mathbb{R}$.

Proof. 1. This follows by LP duality from the definition of $\tilde\nu$.

2. We use the SDP relaxation $\tilde\gamma_2^n(p)$, which may be expressed as

$\tilde\gamma_2^n(p) = \min\{q_+ + q_- : \exists p^+, p^- \in \mathcal{Q}^n,\ q_+, q_- \ge 0,\ p = q_+ p^+ - q_- p^-\},$

and define

$\beta^n(p) = \max\{B(p) : \forall p' \in \mathcal{Q}^n,\ |B(p')| \le 1\}.$

We now show that $\beta^n(p) = \tilde\gamma_2^n(p)$, which proves our statement by taking the limit $n \to \infty$.

[$\beta^n(p) \le \tilde\gamma_2^n(p)$] Let $\tilde\gamma_2^n(p) = q_+ + q_-$, where $q_+, q_- \ge 0$ and $p = q_+ p^+ - q_- p^-$ for some $p^+, p^- \in \mathcal{Q}^n$. Similarly, let $\beta^n(p) = B(p)$, where $|B(p')| \le 1$ for all $p' \in \mathcal{Q}^n$. It then follows that

$B(p) = q_+ B(p^+) - q_- B(p^-) \le q_+ |B(p^+)| + q_- |B(p^-)| \le q_+ + q_-.$

[$\beta^n(p) \ge \tilde\gamma_2^n(p)$] In order to use SDP duality, we first express $\tilde\gamma_2^n(p)$ in standard SDP form. Using the definition of $\mathcal{Q}^n$,

72"(p)=niinr+,+r-^i 
subject to r+,r" )p 0, 

tr(Ffc^r+) = tr(Ffe^r-) = V/s G [m(n)]. 
The dual SDP then reads



(5"(p)==max ^ BabxyP{a,b\x,y) 

a.h,x,y 

SUbjeCttO ^-bxvTE^(x).E,(y)>-[^l.l+ ^^^M^fcH] V T ^ 0, 

a,b,x,y fe£[-m(n)] 

BabxyTE.ix),E,(y)<Tl,l+ ^ S^r(F,!r) V T ^ 0. 
o,,b,x,y k£[m{n)] 

It may be shown that the dual is strictly feasible, so that strong duality holds and $\delta^n(p) = \tilde\gamma_2^n(p)$ (see [VB96]).
Together with the definition of $\mathcal{Q}^n$, this shows that a feasible solution for $\delta^n(p)$ implies a feasible solution for $\beta^n(p)$,
so that $\beta^n(p) \ge \delta^n(p)$. □

4.2 XOR games 

In this section, we consider distributions over binary variables with uniform marginals, $p = (C, 0, 0)$, and furthermore restrict to the case of sign matrices $C \in \{\pm 1\}^{\mathcal{X} \times \mathcal{Y}}$. As we have seen before, this corresponds to the standard framework of communication complexity of Boolean functions, and we have $\tilde\nu(C, 0, 0) = \nu(C)$. We show a close relation between $\nu(C)$, XOR games and Bell inequalities.

In an XOR game, Alice is given some input $x$ and Bob is given an input $y$, and they should output $a = \pm 1$ and $b = \pm 1$. They win if $a \cdot b$ equals some $\pm 1$ function $G(x, y)$. Since they are not allowed to communicate, their strategy may be represented as a local correlation matrix $S \in \mathcal{L}_0$. We consider the distributional version of this game, where $\mu$ is a distribution on the inputs. The winning bias given some strategy $S$ with respect to $\mu$ is $\epsilon_\mu(G \| S) = \sum_{x,y} \mu(x, y)\, G(x, y)\, S(x, y)$, and $\epsilon_\mu^{\mathrm{pub}}(G) = \max_{S \in \mathcal{L}_0} \epsilon_\mu(G \| S)$ is the maximum winning bias of any local (classical) strategy. (For convenience, we consider the bias instead of the game value $\omega_\mu^{\mathrm{pub}}(G) = (1 + \epsilon_\mu^{\mathrm{pub}}(G))/2$.) Define $\epsilon_\mu^{\mathrm{ent}}(G)$ similarly for quantum strategies. When the input distribution is not fixed, we define the game biases as $\epsilon^{\mathrm{pub}}(G) = \min_\mu \epsilon_\mu^{\mathrm{pub}}(G)$ and $\epsilon^{\mathrm{ent}}(G) = \min_\mu \epsilon_\mu^{\mathrm{ent}}(G)$.
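For small games, the maximum classical bias for a fixed input distribution can be computed by brute force. The sketch below is an illustration only: the function name and the encoding of the game as nested lists are assumptions. It relies on the fact that the bias is linear in the strategy, so the maximum over local strategies (convex combinations of deterministic ones) is attained at a deterministic strategy $S(x, y) = a(x)\, b(y)$.

```python
from itertools import product

def classical_bias(G, mu):
    """Maximum of sum_{x,y} mu[x][y] * G[x][y] * a[x] * b[y]
    over deterministic +-1 answer functions a and b."""
    nx, ny = len(G), len(G[0])
    best = float("-inf")
    for a in product((1, -1), repeat=nx):
        # For a fixed a, Bob picks the sign b[y] independently for each y,
        # so his best response contributes the absolute value of each column sum.
        total = 0.0
        for y in range(ny):
            column = sum(mu[x][y] * G[x][y] * a[x] for x in range(nx))
            total += abs(column)
        best = max(best, total)
    return best

# CHSH game: the players should have a*b = -1 only when x = y = 1.
chsh = [[1, 1], [1, -1]]
uniform = [[0.25, 0.25], [0.25, 0.25]]
bias = classical_bias(chsh, uniform)   # classical bias of CHSH
value = (1 + bias) / 2                 # corresponding game value
```

For the CHSH game under the uniform input distribution this gives bias 1/2, that is, the familiar classical winning probability 3/4. The enumeration is exponential in the number of Alice's inputs, so it is only meant for toy instances.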

Lemma 18. There is a bijection between XOR games $(G, \mu)$ and normalized correlation Bell inequalities.

Proof. An XOR game $(G, \mu)$ determines a linear functional $G \circ \mu\,(C) = \epsilon_\mu(G \| C)$ on the set of correlation matrices, where $\circ$ is the Hadamard (entrywise) product. By Definition 9, $\nu^*(G \circ \mu) = \epsilon_\mu^{\mathrm{pub}}(G)$, and $\epsilon_\mu(G \| C) \le \epsilon_\mu^{\mathrm{pub}}(G)$ is a Bell inequality satisfied by any local correlation matrix $C$. Similarly, when the players are allowed to use entanglement, we get a Tsirelson inequality on quantum correlations, $\epsilon_\mu(G \| C) \le \epsilon_\mu^{\mathrm{ent}}(G)$ (the quantum bias is also equivalent to a dual norm, $\epsilon_\mu^{\mathrm{ent}}(G) = \gamma_2^*(G \circ \mu)$).

Conversely, consider a general linear functional $B(C) = \sum_{x,y} B_{xy}\, C(x, y)$ on $\mathrm{aff}(\mathcal{L}_0)$, defining a correlation Bell inequality $B(C) \le \nu^*(B)$ $\forall C \in \mathcal{L}_0$. Dividing this Bell inequality by $\sum_{x,y} |B_{xy}|$, we see that it determines an XOR game specified by a sign matrix $G(x, y) = \mathrm{sgn}(B_{xy})$ and an input distribution $\mu_{xy} = \frac{|B_{xy}|}{\sum_{x',y'} |B_{x'y'}|}$, and having a game bias $\epsilon_\mu^{\mathrm{pub}}(G) = \frac{\nu^*(B)}{\sum_{x,y} |B_{xy}|}$. □

By Theorem 17 and the previous bijection (see also Lee et al. [LSŠ08]):

Corollary 19. 1. $\nu(C) = \max_{\mu, G} \frac{\epsilon_\mu(G \| C)}{\epsilon_\mu^{\mathrm{pub}}(G)}$, where the maximum is over XOR games $(G, \mu)$.

2. $\nu(C) \ge \frac{1}{\epsilon^{\mathrm{pub}}(C)}$.

The second part follows by letting $G = C$. Even though playing correlations $C$ for a game $G = C$ allows us to win with probability one, there are cases where some other game $G \ne C$ yields a larger ratio. In these cases, we have $\nu(C) > \frac{1}{\epsilon^{\mathrm{pub}}(C)}$, so that $\nu$ gives a stronger lower bound for communication complexity than the game value (which has been shown to be equivalent to the discrepancy method [LSŠ08]).
We can characterize when the inequality is tight. Let $\epsilon_{=}^{\mathrm{pub}}(C) = \max_{S \in \mathcal{L}_0}\{\beta : \forall x, y,\ C(x, y)\, S(x, y) = \beta\}$, that is, we only consider strategies that win the game with equal bias with respect to all distributions. For the sake of comparison, the game bias may also be expressed as [von28]:

$\epsilon^{\mathrm{pub}}(C) = \max_{S \in \mathcal{L}_0}\{\beta : \forall x, y,\ C(x, y)\, S(x, y) \ge \beta\} = \max_{S \in \mathcal{L}_0} \min_{x,y} C(x, y)\, S(x, y).$



Lemma 20. $\nu(C) = \frac{1}{\epsilon_{=}^{\mathrm{pub}}(C)}$.



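We can also relate the game value to $\nu^\alpha(C)$, as it was shown in [LSŠ08] that for $\alpha \to \infty$, $\nu^\infty(C)$ is exactly the inverse of the game bias, $\frac{1}{\epsilon^{\mathrm{pub}}(C)}$. We show that this holds as soon as $\alpha = \frac{1}{1 - 2\epsilon}$ is large enough for $C$ to be local up to an error $\epsilon$, completing the picture given in Lemma 16.

Lemma 21. Let $0 \le \epsilon < 1/2$ and $\alpha = \frac{1}{1 - 2\epsilon}$. For any sign matrix $C : \mathcal{X} \times \mathcal{Y} \to \{-1, 1\}$,

1. $\tilde\nu^\epsilon(C) = 1 \iff \epsilon \ge \frac{1 - \epsilon^{\mathrm{pub}}(C)}{2} \iff \alpha \ge \frac{1}{\epsilon^{\mathrm{pub}}(C)} \implies \nu^\alpha(C) = \nu^\infty(C) = \frac{1}{\epsilon^{\mathrm{pub}}(C)}$,

2. $\tilde\gamma_2^\epsilon(C) = 1 \iff \epsilon \ge \frac{1 - \epsilon^{\mathrm{ent}}(C)}{2} \iff \alpha \ge \frac{1}{\epsilon^{\mathrm{ent}}(C)} \implies \gamma_2^\alpha(C) = \gamma_2^\infty(C) = \frac{1}{\epsilon^{\mathrm{ent}}(C)}$.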
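Proof. By von Neumann's minmax principle [von28],

$\epsilon^{\mathrm{pub}}(C) = \max_{S \in \mathcal{L}_0} \min_{x,y} C(x, y)\, S(x, y)$
$= \max_{S \in \mathcal{L}_0} \min_{x,y} \left(1 - |C(x, y) - S(x, y)|\right),$

where we used the fact that $C$ is a sign matrix. This implies that $\tilde\nu^\epsilon(C) = 1 \iff \epsilon \ge \frac{1 - \epsilon^{\mathrm{pub}}(C)}{2} \iff \alpha \ge \frac{1}{\epsilon^{\mathrm{pub}}(C)}$.

By Lemma 16, this in turn implies that $\nu^\alpha(C) = \frac{\tilde\nu^\epsilon(C)}{1 - 2\epsilon}$ for $\epsilon < \frac{1 - \epsilon^{\mathrm{pub}}(C)}{2}$. By continuity, taking the limit $\epsilon \to \frac{1 - \epsilon^{\mathrm{pub}}(C)}{2}$ yields $\nu^\alpha(C) = \frac{1}{\epsilon^{\mathrm{pub}}(C)}$ for $\alpha = \frac{1}{\epsilon^{\mathrm{pub}}(C)}$. From [LSŠ08], $\nu^\infty(C) = \frac{1}{\epsilon^{\mathrm{pub}}(C)}$, and the lemma follows by the monotonicity of $\nu^\alpha(C)$ as a function of $\alpha$. □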



5 Comparing $\tilde\gamma_2$ and $\tilde\nu$

It is known that because of Grothendieck's inequality, $\gamma_2$ and $\nu$ differ by at most a constant. Although neither of these
hold beyond the Boolean setting with uniform marginals, we show in this section that this surprisingly also extends to
non-signaling distributions.

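Theorem 22. For any distribution $p \in \mathcal{C}$, with inputs in $\mathcal{X} \times \mathcal{Y}$ and outcomes in $\mathcal{A} \times \mathcal{B}$ with $A = |\mathcal{A}|$, $B = |\mathcal{B}|$,

1. $\tilde\nu(p) \le (2K_G + 1)\, \tilde\gamma_2(p)$ when $A = B = 2$,

2. $\tilde\nu(p) \le [2AB(K_G + 1) - 1]\, \tilde\gamma_2(p)$ for any $A, B$.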

The negative consequence of this is that one cannot hope to prove separations between classical and quantum 
communication using this method, except in the case where the number of outcomes is large. For binary outcomes at 
least, this says that arguments based on analysing the distance to the quantum set only, without taking into account 
the particular structure of the distribution, will not suffice to prove large separations; and other techniques, such as 
information theoretic arguments, may be necessary. 

For example, Brassard et al. [BCT99] give a (promise) distribution based on the Deutsch-Jozsa problem, which can be obtained exactly with entanglement and no communication, but which requires linear communication to simulate exactly. The lower bound is proven using a corruption bound [BCW98], which is closely related to the information theoretic subdistribution bound [JKN08]. For this problem, $\mathcal{X} = \mathcal{Y} = \{0, 1\}^n$ and $\mathcal{A} = \mathcal{B} = [n]$, therefore our method can only prove a lower bound logarithmic in $n$. This is the first example of a problem for which the corruption bound gives an exponentially better lower bound than the Linial and Shraibman family of methods.
On the positive side, this is very interesting for quantum information, since (by Theorem 17), it tells us that the set
of quantum distributions cannot be much larger than the local polytope, for any number of inputs and outcomes. For
binary correlations, this follows from the theorems of Tsirelson (Theorem 7) and Grothendieck (Proposition 9), but no
extensions are known for these results in the more general setting.

The proof will use two rather straightforward lemmas. 

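Lemma 23. If $p = \sum_{i \in [I]} q_i p_i$, where $p_i \in \mathcal{C}$ and $q_i \in \mathbb{R}$ for all $i \in [I]$, then $\tilde\nu(p) \le \sum_{i \in [I]} |q_i|\, \tilde\nu(p_i)$.

Proof. By definition, for each $p_i$, there exist $p_i^+, p_i^- \in \mathcal{L}$ and $q_i^+, q_i^- \ge 0$ such that $p_i = q_i^+ p_i^+ - q_i^- p_i^-$, and $q_i^+ + q_i^- = \tilde\nu(p_i)$. Therefore, $p = \sum_{i \in [I]} q_i (q_i^+ p_i^+ - q_i^- p_i^-)$ and $\sum_{i \in [I]} (|q_i q_i^+| + |q_i q_i^-|) = \sum_i |q_i| (q_i^+ + q_i^-) = \sum_i |q_i|\, \tilde\nu(p_i)$. □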

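Lemma 24. Let $p, p' \in \mathcal{C}$ be non-signaling distributions with inputs in $\mathcal{X} \times \mathcal{Y}$ for both distributions, outcomes in $\mathcal{A} \times \mathcal{B}$ for $p$, and outcomes in $\mathcal{A}' \times \mathcal{B}'$ for $p'$, such that $\mathcal{A} \subseteq \mathcal{A}'$ and $\mathcal{B} \subseteq \mathcal{B}'$. If for any $(a, b) \in \mathcal{A} \times \mathcal{B}$, $p'(a, b|x, y) = p(a, b|x, y)$, then $\tilde\nu(p') = \tilde\nu(p)$.

Proof. Let $\mathcal{E} = (\mathcal{A}' \times \mathcal{B}') \setminus (\mathcal{A} \times \mathcal{B})$. First, note that since $p'(a, b|x, y) = p(a, b|x, y)$ for any $(a, b) \in \mathcal{A} \times \mathcal{B}$, we have, by normalization of $p$, $p'(a, b|x, y) = 0$ for any $(a, b) \in \mathcal{E}$.

[$\tilde\nu(p') \le \tilde\nu(p)$] Let $p = q_+ p^+ - q_- p^-$ be an affine model for $p$. Obviously, this implies an affine model for $p'$ by extending the local distributions $p^+, p^-$ from $\mathcal{A} \times \mathcal{B}$ to $\mathcal{A}' \times \mathcal{B}'$, by setting $p^+(a, b|x, y) = p^-(a, b|x, y) = 0$ for any $(a, b) \in \mathcal{E}$, so $\tilde\nu(p') \le \tilde\nu(p)$.

[$\tilde\nu(p') \ge \tilde\nu(p)$] Let $p' = q_+ p'^+ - q_- p'^-$ be an affine model for $p'$. We may not immediately derive an affine model for $p$ since it could be the case that $p'^+(a, b|x, y)$ or $p'^-(a, b|x, y)$ is non-zero for some $(a, b) \in \mathcal{E}$. However, we have $q_+ p'^+(a, b|x, y) - q_- p'^-(a, b|x, y) = p'(a, b|x, y) = 0$ for any $(a, b) \in \mathcal{E}$, so we may define an affine model $p = q_+ p^+ - q_- p^-$ where $p^+$ and $p^-$ are distributions on $\mathcal{A} \times \mathcal{B}$ such that

$p^+(a, b|x, y) = p'^+(a, b|x, y) + \frac{1}{A} \sum_{a' \notin \mathcal{A}} p'^+(a', b|x, y) + \frac{1}{B} \sum_{b' \notin \mathcal{B}} p'^+(a, b'|x, y) + \frac{1}{AB} \sum_{a' \notin \mathcal{A},\, b' \notin \mathcal{B}} p'^+(a', b'|x, y),$

and similarly for $p^-$. These are local since it suffices for Alice and Bob to use the local protocol for $p'^+$ or $p'^-$ and for Alice to replace any output $a \notin \mathcal{A}$ by a uniformly random output $a' \in \mathcal{A}$ (similarly for Bob). Therefore, we also have $\tilde\nu(p') \ge \tilde\nu(p)$. □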

Before proving Theorem 22, we first consider the special case of quantum distributions, such that $\tilde\gamma_2(p) = 1$. As we shall see in Section 6, this special case implies the constant upper bound of Shi and Zhu on approximating any quantum distribution [SZ08], which they prove using diamond norms. This also immediately gives an upper bound on maximum Bell inequality violations for quantum distributions, by Theorem 17, which may be of independent interest in quantum information theory.

Proposition 25. For any quantum distribution $p \in \mathcal{Q}$, with inputs in $\mathcal{X} \times \mathcal{Y}$ and outcomes in $\mathcal{A} \times \mathcal{B}$ with $A = |\mathcal{A}|$, $B = |\mathcal{B}|$,

1. $\tilde\nu(p) \le 2K_G + 1$ when $A = B = 2$,

2. $\tilde\nu(p) \le 2AB(K_G + 1) - 1$ for any $A, B$.

Proof. 1. Since $A = B = 2$, we may write the distribution as correlations and marginals, $p = (C, M_A, M_B)$. Since $(C, M_A, M_B) \in \mathcal{Q}$, we also have $(C, 0, 0) \in \mathcal{Q}$, and by Tsirelson's theorem, $(C/K_G, 0, 0) \in \mathcal{L}$. Moreover, it is immediate that $(M_A M_B, M_A, M_B)$, $(M_A M_B, 0, 0)$ and $(0, 0, 0)$ are local distributions as well, so that we have the following affine model for $(C, M_A, M_B)$:

$(C, M_A, M_B) = K_G\, (C/K_G, 0, 0) + (M_A M_B, M_A, M_B) - (M_A M_B, 0, 0) - (K_G - 1)\, (0, 0, 0).$

This implies that $\tilde\nu(C, M_A, M_B) \le 2K_G + 1$.
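2. For the general case, we will reduce to the binary case. Let us introduce an additional output $\emptyset$, and set $\mathcal{A}' = \mathcal{A} \cup \{\emptyset\}$ and $\mathcal{B}' = \mathcal{B} \cup \{\emptyset\}$. We first extend the distribution $p$ to a distribution $p'$ on $\mathcal{A}' \times \mathcal{B}'$ by setting $p'(a, b|x, y) = p(a, b|x, y)$ for any $(a, b) \in \mathcal{A} \times \mathcal{B}$, and $p'(a, b|x, y) = 0$ otherwise. By Lemma 24, we have $\tilde\nu(p) = \tilde\nu(p')$.

For each $(\alpha, \beta) \in \mathcal{A} \times \mathcal{B}$, we also define a probability distribution $p_{\alpha\beta}$ on $\mathcal{A}' \times \mathcal{B}'$:

$p_{\alpha\beta}(a, b|x, y) = p(\alpha, \beta|x, y)$ if $(a, b) = (\alpha, \beta)$,
$p_{\alpha\beta}(a, b|x, y) = p(\alpha|x) - p(\alpha, \beta|x, y)$ if $(a, b) = (\alpha, \emptyset)$,
$p_{\alpha\beta}(a, b|x, y) = p(\beta|y) - p(\alpha, \beta|x, y)$ if $(a, b) = (\emptyset, \beta)$,
$p_{\alpha\beta}(a, b|x, y) = 1 - p(\alpha|x) - p(\beta|y) + p(\alpha, \beta|x, y)$ if $(a, b) = (\emptyset, \emptyset)$,
$p_{\alpha\beta}(a, b|x, y) = 0$ otherwise.

Notice that $p_{\alpha\beta} \in \mathcal{Q}$, since a protocol for $p_{\alpha\beta}$ can be obtained from a protocol for $p$: Alice outputs $\emptyset$ whenever her outcome is not $\alpha$, similarly for Bob. Let $\mathcal{A}_\alpha = \{\alpha, \emptyset\}$ and $\mathcal{B}_\beta = \{\beta, \emptyset\}$. Since $p_{\alpha\beta}(a, b|x, y) = 0$ when $(a, b) \notin \mathcal{A}_\alpha \times \mathcal{B}_\beta$, we may define distributions $p'_{\alpha\beta}$ on $\mathcal{A}_\alpha \times \mathcal{B}_\beta$ such that $p'_{\alpha\beta}(a, b|x, y) = p_{\alpha\beta}(a, b|x, y)$ for all $(a, b) \in \mathcal{A}_\alpha \times \mathcal{B}_\beta$. By Lemma 24, these are such that $\tilde\nu(p'_{\alpha\beta}) = \tilde\nu(p_{\alpha\beta})$, and since these are binary distributions, $\tilde\nu(p'_{\alpha\beta}) \le 2K_G + 1$. Let us define three distributions $p_A, p_B, p_\emptyset$ on $\mathcal{A}' \times \mathcal{B}'$ as follows. We let $p_A(a, \emptyset|x, y) = p(a|x)$, $p_B(\emptyset, b|x, y) = p(b|y)$, and 0 everywhere else; and $p_\emptyset(a, b|x, y) = 1$ if $(a, b) = (\emptyset, \emptyset)$, and 0 otherwise. These are product distributions, so $p_A, p_B, p_\emptyset \in \mathcal{L}$ and $\tilde\nu = 1$ for all three distributions.

We may now build the following affine model for $p'$:

$p' = \sum_{(\alpha, \beta) \in \mathcal{A} \times \mathcal{B}} p_{\alpha\beta} - (B - 1)\, p_A - (A - 1)\, p_B - (AB - A - B + 1)\, p_\emptyset.$

From Lemma 23, we conclude that $\tilde\nu(p') \le AB(2K_G + 2) - 1$.

□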
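The affine model in the second part of this proof is an identity between tables of numbers, so it can be verified mechanically for a fixed input pair $(x, y)$. The following sketch does this with exact rational arithmetic; the function name, the use of `None` for the extra output $\emptyset$, and the toy distribution are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

EMPTY = None  # stands for the additional output (the empty symbol)

def check_affine_model(p, A, B):
    """p maps (a, b) to a Fraction for one fixed input pair; A and B list the outcomes.
    Returns True if sum p_{alpha,beta} - (B-1) p_A - (A-1) p_B - (AB-A-B+1) p_empty
    equals the extension p' of p by zeros."""
    pa = {a: sum(p[(a, b)] for b in B) for a in A}   # Alice's marginal
    pb = {b: sum(p[(a, b)] for a in A) for b in B}   # Bob's marginal
    A1, B1 = A + [EMPTY], B + [EMPTY]
    total = {ab: Fraction(0) for ab in product(A1, B1)}
    # Add every binary distribution p_{alpha,beta} with coefficient +1.
    for al, be in product(A, B):
        total[(al, be)] += p[(al, be)]
        total[(al, EMPTY)] += pa[al] - p[(al, be)]
        total[(EMPTY, be)] += pb[be] - p[(al, be)]
        total[(EMPTY, EMPTY)] += 1 - pa[al] - pb[be] + p[(al, be)]
    # Subtract the three product distributions with their coefficients.
    n_a, n_b = len(A), len(B)
    for a in A:
        total[(a, EMPTY)] -= (n_b - 1) * pa[a]
    for b in B:
        total[(EMPTY, b)] -= (n_a - 1) * pb[b]
    total[(EMPTY, EMPTY)] -= n_a * n_b - n_a - n_b + 1
    extended = {ab: p.get(ab, Fraction(0)) for ab in product(A1, B1)}
    return total == extended

# Toy distribution on 3 x 2 outcomes with weights 1/21, ..., 6/21.
A, B = [0, 1, 2], [0, 1]
p = {ab: Fraction(i + 1, 21) for i, ab in enumerate(product(A, B))}
ok = check_affine_model(p, A, B)
```

The three subtracted coefficients add up to $AB - 1$, which together with the $AB$ binary distributions of weight at most $2K_G + 1$ gives the stated bound $AB(2K_G + 2) - 1$.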

The proof of Theorem 22 immediately follows.

Proof of Theorem 22. By definition of $\tilde\gamma_2(p)$, there exist $p^+, p^- \in \mathcal{Q}$ and $q_+, q_- \ge 0$ such that $p = q_+ p^+ - q_- p^-$ and $q_+ + q_- = \tilde\gamma_2(p)$. From Lemma 23, $\tilde\nu(p) \le q_+ \tilde\nu(p^+) + q_- \tilde\nu(p^-)$, and Proposition 25 immediately concludes the proof. □



6 Upper bounds for non-signaling distributions 

We have seen that if a distribution can be simulated using $t$ bits of communication, then it may be represented by an affine model with coefficients exponential in $t$ (Theorem 13). In this section, we consider the converse: how much communication is sufficient to simulate a distribution, given an affine model? This approach allows us to show that any (shared randomness or entanglement-assisted) communication protocol can be simulated with simultaneous messages, with an exponential cost to the simulation, which was previously known only in the case of Boolean functions [Yao03, SZ08, GKd06]. Our results imply for example that for any quantum distribution $p \in \mathcal{Q}$, $Q_\epsilon^{\|}(p) = O(\log(n))$, where $n$ is the input size. This in effect replaces arbitrary entanglement in the state being measured, with logarithmic quantum communication (using no additional resources such as shared randomness). We use the superscript $\|$ to indicate the simultaneous messages model, where Alice and Bob each send a message to the referee, who without knowing the inputs, outputs the value of the function, or more generally, outputs $a, b$ with the correct probability distribution conditioned on the inputs $x, y$.

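Theorem 26. For any distribution $p \in \mathcal{C}$ with inputs in $\mathcal{X} \times \mathcal{Y}$ with $|\mathcal{X} \times \mathcal{Y}| \le 2^n$, and outcomes in $\mathcal{A} \times \mathcal{B}$ with $A = |\mathcal{A}|$, $B = |\mathcal{B}|$, and any $\epsilon, \delta < 1/2$,

1. $R_{\epsilon+\delta}^{\|,\mathrm{pub}}(p) \le 16 \left[\frac{AB\, \tilde\nu^\epsilon(p)}{\delta}\right]^2 \ln\left[\frac{4AB}{\delta}\right] \log(AB)$,

2. $Q_{\epsilon+\delta}^{\|}(p) \le O\left((AB)^5 \left[\frac{\tilde\nu^\epsilon(p)}{\delta}\right]^4 \ln\left[\frac{AB}{\delta}\right] \log(n)\right)$.

The proof relies on Hoeffding's inequality [McD91].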

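Proposition 27 (Hoeffding's inequality). Let $X$ be a random variable with values in $[a, b]$. Let $X_t$ be the $t$-th of $T$ independent trials of $X$, and $S = \frac{1}{T} \sum_{t=1}^{T} X_t$.

Then, $\Pr[S - E(X) \ge \beta] \le e^{-\frac{2T\beta^2}{(b-a)^2}}$, and $\Pr[E(X) - S \ge \beta] \le e^{-\frac{2T\beta^2}{(b-a)^2}}$, for any $\beta \ge 0$.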
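To see how this inequality translates into a number of repetitions, the following sketch evaluates the one-sided tail bound and inverts it. It assumes the standard form of Hoeffding's inequality for the empirical mean of $T$ independent trials with values in an interval of width $b - a$; the function names are hypothetical.

```python
import math

def hoeffding_tail(T, beta, lo, hi):
    """Upper bound exp(-2 T beta^2 / (hi - lo)^2) on Pr[S - E(X) >= beta]."""
    return math.exp(-2 * T * beta ** 2 / (hi - lo) ** 2)

def trials_needed(beta, failure, lo, hi):
    """Smallest T for which the tail bound above is at most `failure`."""
    return math.ceil((hi - lo) ** 2 * math.log(1 / failure) / (2 * beta ** 2))

# Example: estimate the mean of a [0, 1]-valued variable within 0.1,
# with one-sided failure probability at most 1%.
T = trials_needed(0.1, 0.01, 0.0, 1.0)
```

The number of trials grows as $1/\beta^2$ and only logarithmically in the inverse failure probability, which is the behaviour exploited in the proof of Theorem 26.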

We will also use the following lemma. 

Lemma 28. Let $p$ be a probability distribution on $\mathcal{V}$ with $V = |\mathcal{V}|$, and $e : \mathbb{R}^+ \to \mathbb{R}^+$. For each $v \in \mathcal{V}$, let $Q_v$ be a random variable such that $\forall \beta \ge 0$, $\Pr[Q_v \ge p(v) + \beta] \le e(\beta)$ and $\Pr[Q_v \le p(v) - \beta] \le e(\beta)$.

Then, given samples $\{Q_v : v \in \mathcal{V}\}$, and without knowing $p$, we may simulate a probability distribution $p'$ such that $\delta(p', p) \le 2V[\beta + e(\beta)]$.

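Proof. In order to use the variables $Q_v$ as estimations for $p(v)$, we must first make them positive, and then renormalize them so that they sum up to 1. Let $R_v = \max\{0, Q_v\}$. Then we may easily verify that

$\Pr[R_v \ge p(v) + \beta] \le e(\beta),$
$\Pr[R_v \le p(v) - \beta] \le e(\beta).$

For any subset $\mathcal{E} \subseteq \mathcal{V}$ of size $E = |\mathcal{E}|$, we also define the estimates $R_\mathcal{E} = \sum_{v \in \mathcal{E}} R_v$ for $p(\mathcal{E}) = \sum_{v \in \mathcal{E}} p(v)$. By summing,

$\Pr[R_\mathcal{E} \ge p(\mathcal{E}) + E\beta] \le E\, e(\beta),$
$\Pr[R_\mathcal{E} \le p(\mathcal{E}) - E\beta] \le E\, e(\beta).$

In order to renormalize the estimated probabilities, let $R_\mathcal{V} = \sum_{v \in \mathcal{V}} R_v$. If $R_\mathcal{V} > 1$, we use as final estimates $S_v = R_v / R_\mathcal{V}$. On the other hand, if $R_\mathcal{V} \le 1$, we keep $S_v = R_v$ and introduce a dummy output $\emptyset \notin \mathcal{V}$ with estimated probability $S_\emptyset = 1 - R_\mathcal{V}$ (we extend the original distribution to $\mathcal{V} \cup \{\emptyset\}$, setting $p(\emptyset) = 0$). By outputting $v$ with probability $S_v$, we then simulate some distribution $p'(v) = E(S_v)$, and it suffices to show that $|E(S_\mathcal{E}) - p(\mathcal{E})| \le 2V[\beta + e(\beta)]$ for any $\mathcal{E} \subseteq \mathcal{V} \cup \{\emptyset\}$.

We first upper bound $E(S_\mathcal{E})$ for $\mathcal{E} \subseteq \mathcal{V}$. Since $S_\mathcal{E} \le R_\mathcal{E}$, we obtain from the bounds on $R_\mathcal{E}$ that $\Pr[S_\mathcal{E} \ge p(\mathcal{E}) + E\beta] \le E\, e(\beta)$. Therefore, we have $S_\mathcal{E} < p(\mathcal{E}) + E\beta$ with probability at least $1 - E\, e(\beta)$, and $S_\mathcal{E} \le 1$ with probability at most $E\, e(\beta)$. This implies that $E(S_\mathcal{E}) \le p(\mathcal{E}) + E[\beta + e(\beta)]$.

To lower bound $E(S_\mathcal{E})$, we note that with probability at least $1 - E\, e(\beta)$, we have $R_\mathcal{E} > p(\mathcal{E}) - E\beta$, and with probability at least $1 - V e(\beta)$, we have $R_\mathcal{V} < 1 + V\beta$. Therefore, with probability at least $1 - (E + V)\, e(\beta)$, both these events happen at the same time, so that $S_\mathcal{E} = R_\mathcal{E} / R_\mathcal{V} > (p(\mathcal{E}) - E\beta)(1 - V\beta) \ge p(\mathcal{E}) - (E + V)\beta$. This implies that $E(S_\mathcal{E}) \ge p(\mathcal{E}) - (E + V)[\beta + e(\beta)]$. Since $S_\emptyset = 1 - S_\mathcal{V}$, this also implies that $E(S_\emptyset) \le 2V[\beta + e(\beta)]$. □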
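The clipping and renormalization step at the start of this proof is simple to state as code. The sketch below is one possible rendering, under the assumption that the estimates are given as a dictionary from outcomes to real numbers; the function name and the use of `None` for the dummy output are illustrative choices, not part of the original text.

```python
def renormalize(estimates):
    """Turn real-valued estimates Q_v into probabilities S_v.

    Negative estimates are clipped to zero (R_v = max(0, Q_v)).
    If the clipped values sum to more than 1 they are rescaled;
    otherwise the missing mass goes to a dummy output, keyed by None."""
    clipped = {v: max(0.0, q) for v, q in estimates.items()}
    total = sum(clipped.values())
    if total > 1:
        return {v: r / total for v, r in clipped.items()}
    probs = dict(clipped)
    probs[None] = 1 - total
    return probs

# Estimates that overshoot: they are rescaled to sum to 1.
over = renormalize({"a": 0.7, "b": 0.5, "c": -0.1})
# Estimates that undershoot: the dummy output absorbs the remainder.
under = renormalize({"a": 0.25, "b": 0.5})
```

In both branches the result is a valid probability vector, so the referee can sample from it directly; the lemma then bounds how far the expected output distribution can be from $p$.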

Proof of Theorem 26. 1. Let $\Lambda = \tilde\nu(p)$, $p = q_+ p^+ - q_- p^-$, with $q_+, q_- \ge 0$, $q_+ + q_- = \Lambda$ and $p^+, p^- \in \mathcal{L}$. Let $P^+, P^-$ be protocols for $p^+$ and $p^-$, respectively. These protocols use shared randomness but no communication.

To simulate $p$, Alice and Bob make $T$ independent runs of $P^+$, where we label the outcome of the $t$-th run $(a_t^+, b_t^+)$. Similarly, let $(a_t^-, b_t^-)$ be the outcome of the $t$-th run of $P^-$. They send the list of outcomes to the referee.

The idea is for the referee to estimate $p(a, b|x, y)$ based on the $2T$ samples, and output according to the estimated distribution. Let $P_{t,a,b}^+$ be an indicator variable which equals 1 if $a_t^+ = a$ and $b_t^+ = b$, and 0 otherwise. Define $P_{t,a,b}^-$ similarly. Furthermore, let $P_{t,a,b} = q_+ P_{t,a,b}^+ - q_- P_{t,a,b}^-$. Then $E(P_{t,a,b}) = p(a, b|x, y)$ and $P_{t,a,b} \in [-q_-, q_+]$.

Let $P_{a,b} = \frac{1}{T} \sum_{t=1}^{T} P_{t,a,b}$ be the referee's estimate for $p(a, b|x, y)$. By Hoeffding's inequality,

$\Pr[P_{a,b} \ge p(a, b|x, y) + \beta] \le e^{-\frac{2T\beta^2}{\Lambda^2}},$
$\Pr[P_{a,b} \le p(a, b|x, y) - \beta] \le e^{-\frac{2T\beta^2}{\Lambda^2}}.$

Lemma 28 with $\mathcal{V} = \mathcal{A} \times \mathcal{B}$, $Q_{a,b} = P_{a,b}$ and $e(\beta) = e^{-\frac{2T\beta^2}{\Lambda^2}}$ then implies that the referee may simulate a probability distribution $p'$ such that $\delta(p', p) \le 2AB\left(\beta + e^{-\frac{2T\beta^2}{\Lambda^2}}\right)$. It then suffices to set $\beta = \frac{\delta}{4AB}$, and $T = 8\left[\frac{AB\Lambda}{\delta}\right]^2 \ln\left[\frac{4AB}{\delta}\right]$ to conclude the proof, since Alice sends $2T \log A$ and Bob sends $2T \log B$ bits to the referee.

For $\epsilon > 0$, apply this proof to the distribution $p''$ with statistical distance $\delta(p, p'') \le \epsilon$ and $\tilde\nu(p'') = \tilde\nu^\epsilon(p)$.

Note that the same proof gives an upper bound on the corresponding entanglement-assisted complexity in terms of $\tilde\gamma_2$.

2. If shared randomness is not available but quantum messages are, then we can use quantum fingerprinting [BCWd01, Yao03] to send the results of the repeated protocol to the referee. Let $(a^+(r), b^+(r))$ be the outcomes of $P^+$ using $r$ as shared randomness. We use the random variable $A_a^+(r)$ as an indicator variable for $a^+(r) = a$; similarly $B_b^+$, and $P_\mathcal{E}^+ = \sum_{(a,b) \in \mathcal{E}} A_a^+ B_b^+$.

We can easily adapt the proof of Newman's Theorem fNew911, to show that there exists a set of L random strings 
TZ = {ri, . ■ .r^} such that Vx, y, | Er^eniPg (fi)) ~ -E'(P^) |< a provided L > where n is the input length, and 

is the random variable where randomness is taken from TZ. In other words, by taking the randomness from TZ, we 
may simulate a probability distribution p+ such that (5(p+, p+) < a. 

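For each $(a, b) \in \mathcal{A} \times \mathcal{B}$, Alice and Bob send $T$ copies of the states $|\phi_a^+\rangle = \frac{1}{\sqrt{L}} \sum_{1 \le i \le L} |A_a^+(r_i)\rangle |1\rangle |i\rangle$ and $|\phi_b^+\rangle = \frac{1}{\sqrt{L}} \sum_{1 \le i \le L} |1\rangle |B_b^+(r_i)\rangle |i\rangle$ to the referee. The inner product is

$\langle \phi_a^+ | \phi_b^+ \rangle = \frac{1}{L} \sum_{1 \le i \le L} \langle A_a^+(r_i) | 1 \rangle \langle 1 | B_b^+(r_i) \rangle = \tilde{p}^+(a, b|x, y),$

where the expectation is taken over the random choices $r_1, \ldots, r_L$.

The referee then uses inner product estimation [BCWd01]: for each copy, he performs a measurement on $|\phi_a^+\rangle \otimes |\phi_b^+\rangle$ to obtain a random variable $Z_{t,a,b}^+ \in \{0, 1\}$ such that $\Pr[Z_{t,a,b}^+ = 1] = \frac{1 - |\langle \phi_a^+ | \phi_b^+ \rangle|^2}{2}$, then he sets $Z_{a,b}^+ = \frac{1}{T} \sum_{t=1}^{T} Z_{t,a,b}^+$. Let $Q_{a,b}^+ = \sqrt{1 - 2 Z_{a,b}^+}$ if $Z_{a,b}^+ \le 1/2$ and $Q_{a,b}^+ = 0$ otherwise. This serves as an approximation for $\tilde{p}^+(a, b|x, y) = |\langle \phi_a^+ | \phi_b^+ \rangle|$, and Hoeffding's inequality then yields

$\Pr[Q_{a,b}^+ \ge \tilde{p}^+(a, b|x, y) + \beta] \le e^{-\frac{T\beta^4}{2}},$
$\Pr[Q_{a,b}^+ \le \tilde{p}^+(a, b|x, y) - \beta] \le e^{-\frac{T\beta^4}{2}}.$

Let $Q_{a,b}^-$ be an estimate for $\tilde{p}^-(a, b|x, y)$ obtained using the same method. The referee then obtains an estimate for $\tilde{p}(a, b|x, y) = q_+ \tilde{p}^+(a, b|x, y) - q_- \tilde{p}^-(a, b|x, y)$, by setting $Q_{a,b} = q_+ Q_{a,b}^+ - q_- Q_{a,b}^-$, such that

$\Pr[Q_{a,b} \ge \tilde{p}(a, b|x, y) + \beta] \le 2 e^{-\frac{T\beta^4}{2\Lambda^4}},$
$\Pr[Q_{a,b} \le \tilde{p}(a, b|x, y) - \beta] \le 2 e^{-\frac{T\beta^4}{2\Lambda^4}}.$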

Lemma |28l with e(/3) = 2e then implies that the referee may simulate a probability distribution p* such 
that (5(p^ p) < 2AB{(3 + 2e '^). Since S{p, p) < Aa, we need to pick T,L = ^ large enough so that Aa + 

2AB /3 + 2e-^'5V2A^] < ,5. Settings = J^, /? ^ T = 2^1n(l^) = 2^^ [^]^n(iMB) and L = 

^ = the total complexity of the protocol is 4^BT(log(i) + 2) = 0{{ABf [j]'^hi[^] iog{n)). (We may 

assume that $\Lambda/\delta < n^{1/4}$, otherwise this protocol performs worse than the trivial protocol.) □
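The referee's estimator from the first part of the proof above can be simulated classically. The sketch below is a toy illustration under stated assumptions: it works for a single fixed input pair, the two sampling functions stand in for the zero-communication protocols $P^+$ and $P^-$, and all function names are hypothetical. The helper `runs_needed` evaluates $T = 8[AB\Lambda/\delta]^2 \ln[4AB/\delta]$, which with $\beta = \delta/(4AB)$ makes $2AB(\beta + e^{-2T\beta^2/\Lambda^2})$ at most $\delta$.

```python
import math
import random

def runs_needed(num_a, num_b, lam, delta):
    """Number of runs T = 8 (A B Lambda / delta)^2 ln(4 A B / delta), rounded up."""
    ab = num_a * num_b
    return math.ceil(8 * (ab * lam / delta) ** 2 * math.log(4 * ab / delta))

def referee_estimates(sample_plus, sample_minus, q_plus, q_minus, outcomes, T, rng):
    """Average of q_plus * [plus run gave (a, b)] - q_minus * [minus run gave (a, b)]."""
    est = {ab: 0.0 for ab in outcomes}
    for _ in range(T):
        est[sample_plus(rng)] += q_plus / T
        est[sample_minus(rng)] -= q_minus / T
    return est

OUTCOMES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def sample_uniform(rng):
    # Stand-in for P+: uniformly random pair of bits.
    return rng.choice(OUTCOMES)

def sample_anticorrelated(rng):
    # Stand-in for P-: perfectly anticorrelated bits.
    return rng.choice([(0, 1), (1, 0)])

# Target: p = 2 * uniform - 1 * anticorrelated, i.e. perfectly correlated bits.
estimates = referee_estimates(sample_uniform, sample_anticorrelated,
                              2.0, 1.0, OUTCOMES, 20000, random.Random(0))
```

The estimates always sum to $q_+ - q_- = 1$, but individual entries may be negative, which is why the clipping and renormalization of Lemma 28 is applied before the referee samples an output.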

In the case of Boolean functions, corresponding to correlations $C_f(x, y) \in \{\pm 1\}$ (see Def. 2), the referee's job is
made easier by the fact that he only needs to determine the sign of the correlation with probability $1 - \delta$. This allows
us to get some improvements in the upper bounds. Similar improvements can be obtained for other types of promises
on the distribution.

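Theorem 29. Let $f : \{0, 1\}^n \times \{0, 1\}^n \to \{0, 1\}$, with associated sign matrix $C_f$, and $\epsilon, \delta < 1/2$.

1. $R_\delta^{\|,\mathrm{pub}}(f) \le 4 \left[\frac{\tilde\nu^\epsilon(C_f)}{1 - 2\epsilon}\right]^2 \ln\left(\frac{1}{\delta}\right)$,

2. $Q_\delta^{\|}(f) \le O\left(\log(n) \left[\frac{\tilde\nu^\epsilon(C_f)}{1 - 2\epsilon}\right]^4 \ln\left(\frac{1}{\delta}\right)\right)$.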

From Lemmas 16 and 21, these bounds may also be expressed in terms of $\gamma_2^\alpha$, and the best upper bounds are obtained from $\gamma_2^\infty(C_f) = \frac{1}{\epsilon^{\mathrm{ent}}(C_f)}$. This coincides with the upper bound of [LS09].

Together with the bound between $\tilde\nu$ and $\tilde\gamma_2$ from Section 5 and the lower bounds on communication complexity from Section 3, Theorems 26 and 29 immediately imply the following corollaries.

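Corollary 30. Let $f : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$. For any $\epsilon, \delta < 1/2$, if $Q^{\mathrm{ent}}_{\epsilon}(f) \le q$, then

1. $R^{\parallel,\mathrm{pub}}_{\delta}(f) \le K_G^2\, 2^{2q+2}\, \frac{\ln(1/\delta)}{(1-2\epsilon)^2}$,

2. $Q^{\parallel}_{\delta}(f) \le O\left(\log(n)\, 2^{4q}\, \frac{\ln(1/\delta)}{(1-2\epsilon)^4}\right)$.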

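Let $p \in \mathcal{C}$ be a distribution with inputs in $\mathcal{X} \times \mathcal{Y}$ with $|\mathcal{X} \times \mathcal{Y}| \le 2^n$, and outcomes in $\mathcal{A} \times \mathcal{B}$ with $A = |\mathcal{A}|$, $B = |\mathcal{B}|$.
For any $\epsilon, \delta < 1/2$, if $Q^{\mathrm{ent}}_{\epsilon}(p) \le q$, then

3. $R^{\parallel,\mathrm{pub}}_{\epsilon+\delta}(p) \le O\left(2^{4q}\, \frac{(AB)^2}{\delta^2}\, \ln^2\left[\frac{AB}{\delta}\right]\right)$,

4. $Q^{\parallel}_{\epsilon+\delta}(p) \le O\left(2^{8q}\, \frac{(AB)^5}{\delta^4}\, \ln\left[\frac{AB}{\delta}\right] \log(n)\right)$.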



The first two items can be compared to results of Yao, Shi and Zhu, and Gavinsky et al. [Yao03, SZ08, GKdW06],
who show how to simulate any (logarithmic) communication protocol for Boolean functions in the simultaneous 
messages model, with an exponential blowup in communication. The last two items extend these results to arbitrary 
non-signaling distributions. 

In particular, Item 3 gives in the special case $q = 0$, that is, $p \in \mathcal{Q}$, a much simpler proof of the constant upper
bound on approximating quantum distributions, which Shi and Zhu prove using sophisticated techniques based on
diamond norms [SZ08]. Moreover, Item 3 is much more general as it also allows simulating protocols requiring
quantum communication in addition to entanglement. As for Item 4, it also has interesting new consequences. For
example, it implies that quantum distributions ($q = 0$) can be approximated with logarithmic quantum communication
in the simultaneous messages model, using no additional resources such as shared randomness, and regardless of the 
amount of entanglement in the bipartite state measured by the two parties. 



7 Conclusion and open problems 

By studying communication complexity in the framework provided by the study of quantum non-locality (and beyond), 
we have given very natural and intuitive interpretations of the otherwise very abstract lower bounds of Linial and 
Shraibman. Conversely, bridging this gap has allowed us to port these very strong and mathematically elegant lower 
bound methods to the much more general problem of simulating non-signaling distributions. 

Since many communication problems may be reduced to the task of simulating a non-signaling distribution, we 
hope to see applications of this lower bound method to concrete problems for which standard techniques do not apply, 
in particular for cases that are not Boolean functions, such as non-Boolean functions, partial functions or relations. Let 
us also note that our method can be generalized to multipartite non-signaling distributions, and will hopefully lead to 
applications in the number-on-the-forehead model, for which quantum lower bounds seem hard to prove. 

In the case of binary distributions with uniform marginals (which includes in particular Boolean functions),
Tsirelson's theorem (Theorem 7) and the existence of Grothendieck's constant (Proposition 9) imply that there is
at most a constant gap between $\nu$ and $\gamma_2$. For this reason, it was known that Linial and Shraibman's factorization
norm lower bound technique gives lower bounds of the same order for classical and quantum communication (note
that this is also true for the related discrepancy method). Despite the fact that Tsirelson's theorem and Grothendieck's
inequality are not known to extend beyond the case of Boolean outcomes with uniform marginals, we have shown that
in the general case of distributions, there is also a constant gap (independent of the size of the inputs) between $\tilde{\nu}$ and $\tilde{\gamma}_2$. While this may be seen as a negative
result, this also reveals interesting information about the structure of the sets of local and quantum distributions. In 
particular, this could have interesting consequences for the study of non-local games. 
A Proof of Lemma 17

The proof relies on the following observation: 

Claim 1. Let $|\psi_t\rangle$ be the entangled state shared by Alice and Bob after the first $t = t_A + t_B$ qubits of communication
($t_A$ qubits from Alice to Bob, and $t_B$ qubits from Bob to Alice). This state may be written as
$|\psi_t\rangle = \sum_{i \in I} \mu_i \sum_{T \in \{0,1\}^t} A_T|\alpha^{(i)}\rangle\, B_T|\beta^{(i)}\rangle$, where $\sum_{i \in I} |\mu_i|^2 = 1$, $\{|\alpha^{(i)}\rangle : i \in I\}$ and $\{|\beta^{(i)}\rangle : i \in I\}$ are orthonormal
bases for Alice and Bob's initial registers respectively, and $A_T$, $B_T$ are linear operators such that:

• $A_\emptyset, B_\emptyset$ are the identity operators on Alice and Bob's initial registers, respectively,

• $A_T$ are linear operators acting on Alice's initial register and depending on her input only, satisfying $\sum_{T \in \{0,1\}^t} \|A_T|\psi_A\rangle\|^2 = 2^{t_B}$ for all unit states $|\psi_A\rangle$ of Alice's register,

• $B_T$ are linear operators depending on Bob's input only, satisfying $\sum_{T \in \{0,1\}^t} \|B_T|\psi_B\rangle\|^2 = 2^{t_A}$ for all unit
states $|\psi_B\rangle$ of Bob's register.

Proof of Claim 1. We prove this by induction over $t$. This is true for $t = 0$, since using Schmidt decomposition, we
may write the initial entangled state shared by Alice and Bob, before the quantum communication protocol is initiated,
as $|\psi_0\rangle = \sum_{i \in I} \mu_i |\alpha^{(i)}\rangle|\beta^{(i)}\rangle$, where $\sum_{i \in I} |\mu_i|^2 = 1$ and $\{|\alpha^{(i)}\rangle : i \in I\}$ and $\{|\beta^{(i)}\rangle : i \in I\}$ are orthonormal bases
for Alice and Bob's registers respectively (as is, these are actually just orthonormal sets, but we can always obtain a basis
by setting $\mu_i = 0$ for the missing basis vectors).

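If this is true for $t - 1$, then we have $|\psi_{t-1}\rangle = \sum_{i \in I} \mu_i \sum_{T \in \{0,1\}^{t-1}} A_T|\alpha^{(i)}\rangle\, B_T|\beta^{(i)}\rangle$, where
$\sum_{T \in \{0,1\}^{t-1}} \|A_T|\alpha^{(i)}\rangle\|^2 = 2^{t_B}$ and $\sum_{T \in \{0,1\}^{t-1}} \|B_T|\beta^{(i)}\rangle\|^2 = 2^{t_A - 1}$ for all $i \in I$ (we assume wlog that the
$t$-th qubit is sent by Alice to Bob). Alice's operation at turn $t$ will be to apply some unitary operation $U_t$ on her
register, then send one of the qubits in her register to Bob. By isolating this qubit, we define the linear operators
$A_{T0}$ and $A_{T1}$ to be such that $U_t A_T|\alpha^{(i)}\rangle = A_{T0}|\alpha^{(i)}\rangle|0\rangle + A_{T1}|\alpha^{(i)}\rangle|1\rangle$ for all $i \in I$. Unitarity then implies that
$\|A_{T0}|\alpha^{(i)}\rangle\|^2 + \|A_{T1}|\alpha^{(i)}\rangle\|^2 = \|A_T|\alpha^{(i)}\rangle\|^2$, and as a consequence $\sum_{T \in \{0,1\}^{t}} \|A_T|\alpha^{(i)}\rangle\|^2 = 2^{t_B}$. We then have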




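$|\psi_t\rangle = \sum_{i \in I} \mu_i \sum_{T \in \{0,1\}^{t-1}} \left[ A_{T0}|\alpha^{(i)}\rangle|0\rangle + A_{T1}|\alpha^{(i)}\rangle|1\rangle \right] B_T|\beta^{(i)}\rangle$ (1)

$= \sum_{i \in I} \mu_i \sum_{T \in \{0,1\}^{t}} A_T|\alpha^{(i)}\rangle\, B_T|\beta^{(i)}\rangle$, (2)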
where, for all $T \in \{0,1\}^{t-1}$, we have defined linear operators $B_{T0}, B_{T1}$ such that $B_{T0}|\beta^{(i)}\rangle = |0\rangle B_T|\beta^{(i)}\rangle$ and
$B_{T1}|\beta^{(i)}\rangle = |1\rangle B_T|\beta^{(i)}\rangle$ for all $i \in I$, considering that the additional qubit is in Bob's hands at the end of turn $t$.
Furthermore, we have $\|B_{T0}|\beta^{(i)}\rangle\|^2 + \|B_{T1}|\beta^{(i)}\rangle\|^2 = 2\|B_T|\beta^{(i)}\rangle\|^2$, and as a consequence $\sum_{T \in \{0,1\}^{t}} \|B_T|\beta^{(i)}\rangle\|^2 = 2^{t_A}$,
which completes the proof of our claim. □
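The norm bookkeeping of one induction step can be checked on a toy example. The sketch below is only an illustration under assumed choices: Alice holds two qubits, her unitary is a Hadamard on each qubit (an arbitrary choice, not taken from the protocol), and the qubit she sends is the least significant one. It shows that the sender's total squared norm is preserved while the receiver's doubles.

```python
# Hypothetical unitary U = H (x) H on two qubits; basis order |q1 q0>,
# where q0 (the least significant bit) is the qubit Alice sends to Bob.
U = [[0.5 * s for s in row] for row in (
    (1, 1, 1, 1),
    (1, -1, 1, -1),
    (1, 1, -1, -1),
    (1, -1, -1, 1),
)]


def apply(matrix, vec):
    """Matrix-vector product on plain Python lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def norm_sq(vec):
    """Squared Euclidean norm."""
    return sum(abs(v) ** 2 for v in vec)


def split_on_sent_qubit(vec):
    """Return (A_T0 v, A_T1 v): the parts of U v with the sent qubit equal to 0 and to 1."""
    out = apply(U, vec)
    return out[0::2], out[1::2]


def bob_extend(vec):
    """Return (B_T0 v, B_T1 v) = (|0> (x) v, |1> (x) v) on Bob's enlarged register."""
    zeros = [0.0] * len(vec)
    return list(vec) + zeros, zeros + list(vec)
```

For a unit vector on Alice's side the two branches have squared norms summing to 1, whereas for a unit vector on Bob's side the two extended vectors have squared norms summing to 2, matching the factor gained each time Bob receives a qubit.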

Proof of Lemma 17. At the end of the quantum communication protocol, Alice and Bob share a quantum state $|\psi_q\rangle$
satisfying Claim 1 for $t = q$. Alice and Bob then perform binary ($\{+1,-1\}$-valued) measurements $A$ and $B$ on their
respective parts of the state. By orthonormality of the states I'lpq '^), we have for the correlation 

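$C = \langle\psi_q|A \otimes B|\psi_q\rangle$ (3)

$= \sum_{i,j \in I} \mu_i^{*}\mu_j \sum_{T,U \in \{0,1\}^q} \langle\alpha^{(i)}|A_T^{\dagger} A A_U|\alpha^{(j)}\rangle\, \langle\beta^{(i)}|B_T^{\dagger} B B_U|\beta^{(j)}\rangle$. (4)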

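We may now define the vectors $a(x)$ and $b(y)$ in a $2^{2q}|I|^2$-dimensional complex vector space, with coordinates

$a_{TUij}(x) = \mu_i \langle\alpha^{(j)}|A_U^{\dagger} A A_T|\alpha^{(i)}\rangle$, (5)

$b_{TUij}(y) = \mu_j \langle\beta^{(i)}|B_T^{\dagger} B B_U|\beta^{(j)}\rangle$, $\forall\, T,U \in \{0,1\}^q,\ i,j \in I$, (6)

so that $C = a(x) \cdot b(y)$ (the dot product being conjugate-linear in its first argument). Moreover, using the fact that the $|\alpha^{(j)}\rangle$'s define an orthonormal basis for Alice's register and
the property on the norms of the operators $A_T$, we have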

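$\|a(x)\|^2 = \sum_{i,j \in I} \sum_{T,U \in \{0,1\}^q} |\mu_i|^2\, |\langle\alpha^{(j)}|A_U^{\dagger} A A_T|\alpha^{(i)}\rangle|^2$ (7)

$= \sum_{i \in I} |\mu_i|^2 \sum_{T,U \in \{0,1\}^q} \|A_U^{\dagger} A A_T|\alpha^{(i)}\rangle\|^2$ (8)

$\le \sum_{i \in I} |\mu_i|^2 \sum_{T,U \in \{0,1\}^q} \|A_U^{\dagger}|\tilde{\alpha}^{(i)}_T\rangle\|^2\, \|A A_T|\alpha^{(i)}\rangle\|^2 = 2^{2q_B}$, (9)

where $|\tilde{\alpha}^{(i)}_T\rangle$ is the renormalized state $A A_T|\alpha^{(i)}\rangle$, and $q_A$, $q_B$ denote the values of $t_A$, $t_B$ at $t = q$. So, we have $\|a(x)\| \le 2^{q_B}$, and similarly $\|b(y)\| \le 2^{q_A}$. □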
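As a sanity check of the construction, the simplest case $q = 0$ can be verified numerically. The sketch below assumes coordinates of the form $a_{ij} = \mu_i\langle\alpha^{(j)}|A|\alpha^{(i)}\rangle$ and $b_{ij} = \mu_j\langle\beta^{(i)}|B|\beta^{(j)}\rangle$ (no communication, so the only operators $A_T$, $B_T$ are identities) and a dot product that conjugates its first argument. The state, the angle `THETA` and the choice of Pauli $Y$ for both observables are illustrative assumptions.

```python
import math

THETA = 0.3
MU = [math.cos(THETA), math.sin(THETA)]   # Schmidt coefficients of mu_0|00> + mu_1|11>
Y = [[0.0, -1.0j], [1.0j, 0.0]]           # Pauli Y, a Hermitian +/-1-valued observable


def correlation(mu, obs_a, obs_b):
    """<psi| A (x) B |psi> for |psi> = sum_i mu_i |i>|i>."""
    n = len(mu)
    return sum(mu[i].conjugate() * mu[j] * obs_a[i][j] * obs_b[i][j]
               for i in range(n) for j in range(n))


def alice_vector(mu, obs_a):
    """Coordinates a_ij = mu_i <j|A|i>."""
    n = len(mu)
    return [mu[i] * obs_a[j][i] for i in range(n) for j in range(n)]


def bob_vector(mu, obs_b):
    """Coordinates b_ij = mu_j <i|B|j>."""
    n = len(mu)
    return [mu[j] * obs_b[i][j] for i in range(n) for j in range(n)]


def hermitian_dot(u, v):
    """Inner product that is conjugate-linear in its first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))
```

In this instance the correlation equals $-\sin(2\theta)$, the two vectors reproduce it as an inner product, and both have unit norm, in line with the bound $2^0 = 1$ for $q = 0$.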